[XML-SIG] cStringIO

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Fri, 25 May 2001 22:27:28 +0200

> It looks to me (from skimming the code in cStringIO.c), that the code
> is 8bit transparent.  I thought UTF-8 made all multi-byte values have
> the 8th bit on.  So, if I'm using cStringIO I should be okay, if I'm
> just using cStringIO to transport data, or maybe do readline or
> similar.  Once I need to look at individual characters, I'm hosed.  But
> if I want to collect the value of a bunch of TEXT_NODE elements and
> output them, won't that work?

That depends on exactly how you do it. If you just write the text.data
attribute to the cStringIO, it may fail when text.data is a Unicode
object (please note that a UTF-8-encoded string object is *not* a
Unicode object; it is a byte string).

To see the problem, try:

import cStringIO
o = cStringIO.StringIO()
o.write(u"My 0.02\N{EURO SIGN}")
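
One way to collect text nodes safely is to encode every Unicode object
to UTF-8 explicitly before it reaches the byte buffer, so the buffer
only ever sees byte strings. The following is a minimal sketch, not a
definitive recipe: collect_text and ByteBuffer are names made up for
this illustration, and the fallback to io.BytesIO is an assumption
added so the same code also runs on interpreters that have no
cStringIO module.

```python
try:
    from cStringIO import StringIO as ByteBuffer  # byte-oriented buffer
except ImportError:
    # Assumption: io.BytesIO stands in where cStringIO is unavailable.
    from io import BytesIO as ByteBuffer


def collect_text(chunks, encoding="utf-8"):
    """Join text chunks into one encoded byte string (hypothetical helper)."""
    out = ByteBuffer()
    for chunk in chunks:
        # Unicode objects are encoded explicitly; byte strings are
        # assumed to be in the target encoding already and pass through.
        if not isinstance(chunk, bytes):
            chunk = chunk.encode(encoding)
        out.write(chunk)
    return out.getvalue()


data = collect_text([u"My 0.02", u"\N{EURO SIGN}"])
print(repr(data))
```

In a DOM walk, the chunks would be the text.data values of the
TEXT_NODE elements. The result is a UTF-8 byte string, which can be
written to a file or socket as-is but should be decoded again before
looking at individual characters.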