[Python-bugs-list] [ python-Bugs-547537 ] cStringIO mangles Unicode

noreply@sourceforge.net
Fri, 26 Apr 2002 14:08:17 -0700


Bugs item #547537, was opened at 2002-04-23 08:52
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=547537&group_id=5470

Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Guido van Rossum (gvanrossum)
Assigned to: M.-A. Lemburg (lemburg)
Summary: cStringIO mangles Unicode

Initial Comment:
The last few comments added to bug 216388 indicate a
new problem in cStringIO. Rather than abusing that bug
report, I'm opening a new one here. The problem is that
cStringIO now accepts Unicode strings to write(), but
when you use this, getvalue() returns binary garbage.
The cause is apparently MAL's checkin for cStringIO
2.30, which enabled read buffers.

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-26 17:08

Message:
Should I just check this in? It looks pretty safe to me...

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-23 08:59

Message:
I wonder if perhaps the fix is as simple as using "t#"
instead of "s#" in the PyArg_... format string in P_write().
That accepts Unicode strings as args to write() only when
they are ASCII (actually, it uses the default encoding).
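
To make the difference concrete, here is a rough sketch in modern Python 3,
not the cStringIO C code itself. It assumes the "binary garbage" is a raw
copy of the interpreter's internal Unicode buffer, and it uses UTF-16-LE as
a stand-in for that buffer; the helper names write_raw() and write_encoded()
are hypothetical and exist only for this illustration.

```python
import io

def write_raw(buf, text):
    # Stand-in for the buggy path: copy a UTF-16-LE image of the text,
    # assumed here to resemble the interpreter's internal Unicode buffer.
    buf.write(text.encode("utf-16-le"))

def write_encoded(buf, text, encoding="ascii"):
    # Stand-in for the proposed path: encode with the default encoding and
    # raise UnicodeEncodeError for anything it cannot represent.
    buf.write(text.encode(encoding))

raw = io.BytesIO()
write_raw(raw, "abc")
print(raw.getvalue())   # b'a\x00b\x00c\x00'

ok = io.BytesIO()
write_encoded(ok, "abc")
print(ok.getvalue())    # b'abc'

try:
    write_encoded(io.BytesIO(), "\xe9")
except UnicodeEncodeError:
    print("non-ASCII rejected")
```

With the encoding path, ASCII-only Unicode text round-trips as ordinary
bytes and anything else fails loudly instead of being stored as raw buffer
contents.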

Marc-Andre, can you explain the reason for the change in the
first place (other than fixing a dubious dependency on
PyString_GetSize() raising an exception for a non-string
object)?

----------------------------------------------------------------------
