Python and XML

Lars Marius Garshol larsga at
Wed Feb 2 12:55:30 CET 2000

* Fredrik Lundh
| quick answer: the built-in string type only handles 8-bit
| characters, and most XML parsers (e.g. xmllib) cannot handle
| anything that doesn't use an 8-bit encoding.  in other words, ASCII,
| ISO Latin, UTF-8 (etc) works fine, but 16-bit encodings don't.

It's worth noting that Pyexpat supports UTF-16 input, and passes its
output to Python applications as UTF-8. RXP also supports UTF-16, as
well as the rest of the ISO 8859-x character sets. I assume it also
sends its output as UTF-8.
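
A minimal sketch of the Pyexpat behaviour, written against the
xml.parsers.expat module in a current Python rather than the 2000-era
Pyexpat build discussed here. The sample document and the variable names
are made up for illustration. In a current Python the handler receives
Unicode str objects, so the explicit encode step at the end is only there
to show the UTF-8 bytes that correspond to what the parser delivered back
then.

```python
from xml.parsers import expat

# Hypothetical sample document, encoded as UTF-16 (the codec adds a BOM).
source = '<?xml version="1.0" encoding="UTF-16"?><greeting>bl\u00e5b\u00e6r</greeting>'
document = source.encode("utf-16")

chunks = []  # character data may arrive in several pieces

parser = expat.ParserCreate()
parser.CharacterDataHandler = chunks.append
parser.Parse(document, True)  # True marks this as the final chunk of input

text = "".join(chunks)
# Modern Python hands back str; encode to see the equivalent UTF-8 bytes.
utf8_bytes = text.encode("utf-8")
print(repr(utf8_bytes))
```

The point is that the application never sees the 16-bit encoding: the
parser decodes it, and the application deals with a single output form.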

--Lars M.
