[XML-SIG] Re: Parsing a unicode string
Fred L. Drake, Jr.
fdrake at acm.org
Wed Oct 6 16:58:50 CEST 2004
On Wednesday 06 October 2004 03:37 am, konrad.hinsen at laposte.net wrote:
> What does one gain by marrying XML to byte streams? If some day in the
> future 32-bit units become the smallest useful ones in computing, this
> will just cause compatibility headaches.
All serialization formats end up being tied to byte streams. Files on disk
are byte streams. Data comes over a socket as a byte stream.
It's perfectly ok for the encoding used to be a 16-bit encoding; those should
work just fine for XML.
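To illustrate the point, here is a rough sketch (assuming Python 3 and the standard xml.etree.ElementTree module, neither of which is specific to this thread) of a document serialized in a 16-bit encoding and parsed back from the resulting byte stream. The element name and sample text are made up for the example.

```python
import xml.etree.ElementTree as ET

# Sample document; the declaration names the 16-bit encoding used below.
text = '<?xml version="1.0" encoding="utf-16"?><greeting>h\u00e9llo</greeting>'

# Serializing gives a byte stream: a byte-order mark, then 16-bit code units.
data = text.encode("utf-16")

# The parser consumes the bytes and hands back decoded text.
root = ET.fromstring(data)
print(root.tag, root.text)
```

The 16-bit units still travel as bytes; the byte-order mark tells the parser how to reassemble them.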
I think it's possible for a low-level parser API to accept Python Unicode
objects and use the internal encoding for them. One catch is that for each
additional chunk of data, it needs to always check that it gets Unicode.
Tedious, but doable. (I just haven't had time to do this for the Expat
bindings yet.)
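The per-chunk check might look roughly like the sketch below. It assumes Python 3, where the Unicode type is str and xml.parsers.expat accepts str input; UnicodeOnlyFeeder is a hypothetical name, not part of any existing binding. It only shows the type check on every chunk and does not reuse the interpreter's internal encoding.

```python
import xml.parsers.expat


class UnicodeOnlyFeeder:
    """Hypothetical wrapper: feeds Expat, accepting only str chunks."""

    def __init__(self):
        self.parser = xml.parsers.expat.ParserCreate()
        self.text = []
        # Collect character data so the result can be inspected later.
        self.parser.CharacterDataHandler = self.text.append

    def feed(self, chunk, final=False):
        # The check has to be repeated for every chunk, not just the first.
        if not isinstance(chunk, str):
            raise TypeError("expected str, got %s" % type(chunk).__name__)
        self.parser.Parse(chunk, final)


feeder = UnicodeOnlyFeeder()
feeder.feed("<doc>caf")
feeder.feed("\u00e9</doc>", final=True)
result = "".join(feeder.text)
print(result)
```

Passing a bytes chunk to feed() at any point raises TypeError, which is the tedious-but-doable part.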
-Fred
--
Fred L. Drake, Jr. <fdrake at acm.org>