[XML-SIG] Re: Parsing a unicode string

Fred L. Drake, Jr. fdrake at acm.org
Wed Oct 6 16:58:50 CEST 2004


On Wednesday 06 October 2004 03:37 am, konrad.hinsen at laposte.net wrote:
 > What does one gain by marrying XML to byte streams? If some day in the
 > future 32-bit units become the smallest useful ones in computing, this
 > will just cause compatibility headaches.

All serialization formats end up being tied to byte streams.  Files on disk 
are byte streams.  Data comes over a socket as a byte stream.

It's perfectly OK for that encoding to be a 16-bit one such as UTF-16; those 
should work just fine for XML.
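
As a rough illustration of that point, here is a minimal sketch using the 
standard library's Expat bindings in a current Python.  The helper name 
parse_text and the sample document are made up for the example; the only 
claim is that the same document parses to the same text whether it arrives 
as UTF-8 or UTF-16 bytes.

```python
import xml.parsers.expat


def parse_text(data: bytes) -> str:
    """Parse XML bytes and return all character data joined together.

    Hypothetical helper, for illustration only.
    """
    pieces = []
    parser = xml.parsers.expat.ParserCreate()
    # Expat may deliver character data in several calls, so collect them.
    parser.CharacterDataHandler = pieces.append
    parser.Parse(data, True)
    return "".join(pieces)


# Sample document; the encoding declaration is filled in per serialization.
doc = '<?xml version="1.0" encoding="{}"?><greeting>caf\u00e9</greeting>'

utf8_bytes = doc.format("utf-8").encode("utf-8")
# Python's "utf-16" codec prepends a byte order mark, which the parser
# uses to detect the 16-bit encoding.
utf16_bytes = doc.format("utf-16").encode("utf-16")

same = parse_text(utf8_bytes) == parse_text(utf16_bytes)
```

Either way the parser is handed a byte stream; only the encoding of that 
stream differs.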

I think it's possible for a low-level parser API to accept Python Unicode 
objects and use the internal encoding for them.  One catch is that for each 
additional chunk of data, it needs to always check that it gets Unicode.  
Tedious, but doable.  (I just haven't had time to do this for the Expat 
bindings yet.)
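
To make the per-chunk check concrete, here is a minimal sketch of what 
such a wrapper could look like on top of the existing bindings in a 
current Python.  The class name UnicodeFeeder and its feed() method are 
hypothetical, not part of any library, and the sketch re-encodes each 
chunk as UTF-8 rather than handing the parser Python's internal 
representation, so it only approximates the idea.

```python
import xml.parsers.expat


class UnicodeFeeder:
    """Hypothetical wrapper that accepts only str chunks (illustration only)."""

    def __init__(self):
        # Passing an encoding here overrides whatever the document declares,
        # which is what we want since every chunk is re-encoded as UTF-8.
        self._parser = xml.parsers.expat.ParserCreate(encoding="utf-8")
        self.text = []
        self._parser.CharacterDataHandler = self.text.append

    def feed(self, chunk, final=False):
        # The tedious part: every chunk has to be checked, not just the first.
        if not isinstance(chunk, str):
            raise TypeError(
                "expected str, got %s" % type(chunk).__name__
            )
        # A str chunk always holds whole code points, so encoding each
        # chunk independently never splits a character.
        self._parser.Parse(chunk.encode("utf-8"), final)


feeder = UnicodeFeeder()
feeder.feed("<greeting>caf")
feeder.feed("\u00e9</greeting>", final=True)
result = "".join(feeder.text)
```

Mixing in a bytes chunk partway through raises TypeError instead of 
silently being interpreted in some other encoding.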


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>


