[Python-3000] XML as bytes or unicode?
"Martin v. Löwis"
martin at v.loewis.de
Mon Aug 25 07:37:41 CEST 2008
> Well, does the parser handle it or should the code that got the XML in
> the first place handle it?
The parser handles encodings in XML; XML parsing is "bytes in, pieces of
Unicode out".
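
To illustrate "bytes in, pieces of Unicode out", here is a minimal sketch assuming Python 3's standard xml.etree.ElementTree module. The sample documents and variable names are made up for illustration; the point is only that the parser reads the encoding declaration itself and the caller never decodes anything by hand.

```python
import xml.etree.ElementTree as ET

# The same (hypothetical) document serialized as bytes in two encodings.
# Each byte string carries its own encoding declaration.
latin1_doc = (
    '<?xml version="1.0" encoding="iso-8859-1"?><name>L\u00f6wis</name>'
).encode("iso-8859-1")
utf8_doc = (
    '<?xml version="1.0" encoding="utf-8"?><name>L\u00f6wis</name>'
).encode("utf-8")

# The parser is handed raw bytes and returns text as str (Unicode).
latin1_text = ET.fromstring(latin1_doc).text
utf8_text = ET.fromstring(utf8_doc).text

print(type(latin1_text).__name__, latin1_text == utf8_text)
```

The byte strings differ, but the text that comes out of the parser is the same Unicode string in both cases.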
> Apparently whoever wrote the parsers originally thought it was not
> the parser's job. =)
Why do you think so? In Python, the XML parsers have always supported
encoding declarations.
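Parsing XML that is already a Unicode string isn't all that meaningful:
the encoding declaration describes bytes, and those are gone once the
text has been decoded.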
> If someone wanted to, you could possibly dispatch on bytes to some code
> that tried to determine the encoding and do the proper decode before
> proceeding.
That's the parser's job (and one that expat does correctly).
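
As a rough sketch of expat doing that detection itself, assuming Python 3's xml.parsers.expat binding: the helper name collect_text and the sample document are hypothetical, and no encoding is passed to ParserCreate, so expat has to work it out from the byte-order mark and the declaration.

```python
import xml.parsers.expat


def collect_text(data: bytes) -> str:
    """Feed raw bytes to expat and return all character data as one str."""
    # No encoding argument: expat determines the encoding from the input.
    parser = xml.parsers.expat.ParserCreate()
    chunks = []
    # Character data may arrive in several pieces, so collect and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)  # True marks this as the final chunk
    return "".join(chunks)


# Hypothetical sample: a UTF-16 document, byte-order mark included.
utf16_doc = (
    '<?xml version="1.0" encoding="utf-16"?><p>caf\u00e9</p>'
).encode("utf-16")

text = collect_text(utf16_doc)
print(text == "caf\u00e9")
```

The calling code only passes bytes along; there is no separate dispatch step that guesses the encoding and decodes first.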
Regards,
Martin