error when parsing xml
R.Brodie at rl.ac.uk
Mon Sep 5 15:56:50 CEST 2005
> > I have found that some people refuse to stick to standards, so whenever I
> > parse XML files I remove any characters that fall in the range
> > <= 0x1f
> >>= 0xf0
> Now of what help shall that be? Get rid of all accented characters?
> Sorry, but that surely is the dumbest thing to do here - and has
> _nothing_ to do with standards!
Earlier versions of the Microsoft XML parser accept invalid characters
(e.g. most of those < 0x1f). Sadly, you do find files in the wild that need
to have these stripped before feeding them to a conforming parser.
One can be too enthusiastic about the process, though.
More information about the Python-list