error when parsing xml
Diez B. Roggisch
deets at nospam.web.de
Mon Sep 5 09:06:44 EDT 2005
> I have found that some people refuse to stick to standards, so whenever I
> parse XML files I remove any characters that fall in the range
> <= 0x1f
>
> >= 0xf0
How is that supposed to help? By getting rid of all accented characters?
Sorry, but that is surely the dumbest thing to do here - and it has
_nothing_ to do with standards! Character sets with code points > 127 are
pretty common and well standardized, they are just not "ascii". I suggest you read
up on the topic of Unicode & encodings a bit - and then fix some code of
yours...
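
To make the point concrete, here is a minimal Python sketch. It is not the code from the quoted message: the sample document, the name in it, and the helpers strip_range and parse_name are made up for illustration, and it assumes the file is encoded as ISO-8859-1 and says so in its XML declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a document that declares ISO-8859-1 and is encoded that way.
DOC = '<?xml version="1.0" encoding="iso-8859-1"?><name>Jörg Müller</name>'.encode(
    "iso-8859-1"
)


def strip_range(data: bytes) -> bytes:
    """Mimic the quoted filter: drop every byte <= 0x1f or >= 0xf0."""
    return bytes(b for b in data if 0x1F < b < 0xF0)


def parse_name(data: bytes) -> str:
    """Parse raw bytes; the parser takes the encoding from the XML declaration."""
    return ET.fromstring(data).text


original = parse_name(DOC)               # 'Jörg Müller'
damaged = parse_name(strip_range(DOC))   # 'Jrg Mller': ö (0xf6) and ü (0xfc) are gone
```

The unfiltered bytes parse fine because the declared encoding is honoured; the filter only destroys data. It also removes tab, line feed and carriage return, which are legal in XML, and in a UTF-8 file any byte >= 0xf0 is the lead byte of a four-byte sequence, so stripping it leaves invalid input behind.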
Diez