[XML-SIG] cStringIO

Fred L. Drake, Jr. fdrake@acm.org
Fri, 25 May 2001 16:39:52 -0400 (EDT)


Martin v. Loewis writes:
 > One issue of reading UTF-8, whether from cStringIO or elsewhere, might
 > break result strings inside a character (i.e. between character
 > boundaries). So be careful with applying unicode() or .decode on such
 > a string - you may have to save some bytes for the next .read() call.

  Correct -- the cStringIO object is just a stream of bytes, like a
file object.  To read characters, you'll need to wrap it with a
decoder using the codecs module, or pass the bytes to a parser that
can handle them properly (like Expat).
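
  As a rough sketch of the codecs approach (not the only way to do it):
the example below uses Python 3's io.BytesIO as a stand-in for a
cStringIO object, and the sample text and the 3-byte read size are made
up purely to force a read to end in the middle of a character.

```python
import codecs
import io

# Hypothetical sample data; "ï", "é" and "€" are multi-byte in UTF-8.
data = "naïve café €".encode("utf-8")
stream = io.BytesIO(data)  # stand-in for a cStringIO byte stream

# Option 1: an incremental decoder holds back the bytes of an incomplete
# character until the next chunk supplies the rest.
decoder = codecs.getincrementaldecoder("utf-8")()
pieces = []
while True:
    raw = stream.read(3)  # deliberately small, so reads split characters
    if not raw:
        break
    pieces.append(decoder.decode(raw))
# final=True makes the decoder complain if bytes are still left over.
pieces.append(decoder.decode(b"", final=True))
text_incremental = "".join(pieces)

# Option 2: wrap the byte stream in a StreamReader and read characters.
stream.seek(0)
reader = codecs.getreader("utf-8")(stream)
text_reader = reader.read()
```

Calling .decode("utf-8") directly on each 3-byte chunk would fail on the
first one here, since that chunk ends after the first byte of "ï".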


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations