XML can't read Unicode shock horror. News at 11.
Walter Dörwald
walter at livinglogic.de
Wed Oct 31 12:17:27 EST 2001
Martin von Loewis wrote:
> Dale Strickland-Clark <dale at riverhall.NOTHANKS.co.uk> writes:
>
>
>>I see that this is probably the same as Python bug #216388 which has
>>been around for over a year and been given a low priority (3).
>>
>
> It is not the same bug. Even if cStringIO supported Unicode objects,
> expat would still require byte strings.
>
>
>>Non-unicode XML is a bit restrictive. :-(
>>
>
> Why do you think so? XML documents are byte sequences, not character
> strings. The *content* is Unicode; the document is not.
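The distinction can be sketched with expat directly. This is only an illustration, assuming a current Python 3 where pyexpat hands character data to the handler as str; the helper name text_of is made up for the example. Two different byte sequences, one encoded as ISO-8859-1 and one as UTF-8, yield the same Unicode content:

```python
import xml.parsers.expat


def text_of(document_bytes):
    """Return the character data of an XML document given as bytes.

    text_of is a hypothetical helper for this example, not a library API.
    """
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # expat decodes the bytes according to the XML declaration and
    # passes already decoded text to the handler.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document_bytes, True)
    return "".join(chunks)


latin1_doc = (
    '<?xml version="1.0" encoding="iso-8859-1"?><name>Dörwald</name>'
).encode("iso-8859-1")
utf8_doc = (
    '<?xml version="1.0" encoding="utf-8"?><name>Dörwald</name>'
).encode("utf-8")

# Different documents on the byte level, identical content.
print(latin1_doc == utf8_doc)
print(text_of(latin1_doc) == text_of(utf8_doc))
```

So the parser consumes bytes and hands back Unicode text.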
But xml.sax.xmlreader.InputSource provides the methods setCharacterStream
and getCharacterStream, which exist so that a parser can be handed
something that is already a decoded Unicode character stream.
Do any of the available parsers support this?
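To make the question concrete, here is a minimal sketch of what such usage would look like. It assumes a current Python 3, where (as far as I know, from version 3.5 on) the bundled expat-based reader accepts character streams; it says nothing about the parsers available when this was written. TextCollector and parse_decoded are hypothetical names chosen for the example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class TextCollector(ContentHandler):
    """Hypothetical handler that collects all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # May be called several times for one text node, so collect pieces.
        self.chunks.append(content)


def parse_decoded(text):
    """Parse XML that is already a decoded string via a character stream."""
    source = InputSource()
    # Hand the parser a text stream instead of a byte stream.
    source.setCharacterStream(io.StringIO(text))

    handler = TextCollector()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.parse(source)
    return "".join(handler.chunks)


print(parse_decoded("<name>Walter Dörwald</name>"))
```

The input string carries no encoding declaration, since the text is already decoded and no byte-level encoding applies to it.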
Bye,
Walter Dörwald