SOT : & in XML-documents
2002 at weholt.org
Tue Oct 8 21:49:13 CEST 2002
Damn!! The original data is full of them. My solution so far has been to
keep a list of characters to replace (limited to '&' so far; '&' is
replaced with 'and', etc.).
Are there any other characters I must avoid/replace?
Thanks for your help.
"Henrik Motakef" <henrik.motakef at web.de> wrote in message
news:87bs64903u.fsf at pokey.henrik-motakef.de...
> "Thomas Weholt" <2002 at weholt.org> writes:
> > I'm trying to parse an old fileformat into xml. The problem is that the
> > character & appears from time to time in the original file.
> > Anybody got any clues on how to avoid problems with characters like
> Don't use them ;-) Or, better, properly escape them as &amp;. This is
> not an issue of the charset, so no XML declaration will save you.
> If you are dealing with HTML, you could use tidy (google will find it
> for you) to create well-formed XML. IIRC there is also a shareware
> program that tries to clean up broken XML regardless of its document
> type, probably called "XML tidy" or some such.
> Good luck
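
For what it's worth, here is a minimal sketch of the escaping suggested above, assuming Python's standard xml.sax.saxutils module is available. The <record> element and the to_xml_record helper are made up for illustration and are not part of the original file format. In element text the characters that must be escaped are '&' and '<'; escape() also converts '>'.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def to_xml_record(text):
    """Wrap raw text in a hypothetical <record> element.

    escape() replaces '&', '<' and '>' with '&amp;', '&lt;' and '&gt;',
    so the original characters survive instead of being swapped for 'and'.
    """
    return "<record>%s</record>" % escape(text)


raw = "Tom & Jerry <1940>"
xml_text = to_xml_record(raw)
print(xml_text)

# Parsing the result back gives the original text, which shows that
# nothing was lost by escaping.
print(ET.fromstring(xml_text).text)
```

For attribute values, quoteattr() from the same module also takes care of the quote characters.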