[ python-Bugs-1235646 ] codecs.StreamRecoder.next doesn't encode
SourceForge.net
noreply at sourceforge.net
Fri Sep 2 20:33:54 CEST 2005
Bugs item #1235646, was opened at 2005-07-10 18:55
Message generated for change (Comment added) made by doerwalter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1235646&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Unicode
Group: Python 2.4
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Sebastian Wangnick (wangnick)
Assigned to: Walter Dörwald (doerwalter)
Summary: codecs.StreamRecoder.next doesn't encode
Initial Comment:
codecs.StreamRecoder.next doesn't encode the data it
gets from self.reader.next. This breaks the "for line in
codecs.EncodedFile(...)" idiom.
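For illustration, the idiom in question can be sketched as follows. This is
a minimal sketch written against a current Python 3 codecs module, not the
2.4 code the report was filed against; the sample data and the variable
names (backend, recoder, lines) are made up for the example. With a working
next(), each line read from the Latin-1 backend should come back recoded
as UTF-8.

```python
import codecs
import io

# Backend stream holding Latin-1 bytes (sample data invented for this sketch).
backend = io.BytesIO("Gr\u00fc\u00dfe\nWelt\n".encode("latin-1"))

# Reads are decoded from file_encoding and handed back in data_encoding.
recoder = codecs.EncodedFile(backend, data_encoding="utf-8",
                             file_encoding="latin-1")

# The idiom from the report: iterate over the recoder line by line.
lines = [line for line in recoder]

for line in lines:
    print(repr(line))
```

If iteration skipped the encode step, the lines would come back as decoded
text instead of data in the requested encoding.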
----------------------------------------------------------------------
>Comment By: Walter Dörwald (doerwalter)
Date: 2005-09-02 20:33
Message:
Logged In: YES
user_id=89016
OK, now I'm beginning to understand the docstring.
Nevertheless I think having a class that uses stateful
codecs at both ends would be useful. If you want, I can give
this a try (after I'm back from vacation in four weeks).
Closing the report as fixed.
----------------------------------------------------------------------
Comment By: M.-A. Lemburg (lemburg)
Date: 2005-09-01 20:28
Message:
Logged In: YES
user_id=38388
Thanks, Walter.
StreamRecoder is not broken: it works as advertised (see
the .__init__() doc-string and interface) and yes, this
means that only stateless encodings can be used, such as
UTF-16-LE, simply because the encode and decode
functions are defined as being stateless.
----------------------------------------------------------------------
Comment By: Walter Dörwald (doerwalter)
Date: 2005-09-01 14:22
Message:
Logged In: YES
user_id=89016
Checked in as:
Lib/codecs.py 1.48/1.35.2.10
I'll try to add tests for StreamRecoder tomorrow.
StreamRecoder is broken in its current form, as it uses the
stateless codec for the frontend encoding. Recoding from
e.g. latin-1 to utf-16 will return a BOM for every call to
read() which is clearly wrong. What gets read from the
backend stream should be pushed through a *stateful*
encoder. BTW, a feed style API would help here ;)
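
The difference between the two approaches can be sketched like this. It is
only an illustration under stated assumptions: it uses
codecs.getincrementalencoder, a feed-style API found in later Python
versions and not the one under discussion here, and the chunks list simply
stands in for two successive read() results.

```python
import codecs

# Two successive read() results, invented for this sketch.
chunks = ["ab", "cd"]

# Stateless: every call starts from scratch, so UTF-16 emits a BOM each time.
stateless = b"".join(codecs.encode(chunk, "utf-16") for chunk in chunks)

# Stateful: one encoder object remembers that the BOM was already written.
encoder = codecs.getincrementalencoder("utf-16")()
stateful = b"".join(encoder.encode(chunk) for chunk in chunks)

# UTF-16-LE never writes a BOM, so chunkwise stateless encoding is harmless.
le_chunks = b"".join(codecs.encode(chunk, "utf-16-le") for chunk in chunks)

print(stateless.count(codecs.BOM_UTF16))  # number of BOMs, stateless
print(stateful.count(codecs.BOM_UTF16))   # number of BOMs, stateful
```

The UTF-16-LE case is the kind of encoding for which the stateless
functions give the same bytes as encoding the whole text at once.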
----------------------------------------------------------------------
Comment By: M.-A. Lemburg (lemburg)
Date: 2005-08-31 22:58
Message:
Logged In: YES
user_id=38388
Looks good, Walter.
Please check it in.
Thanks.
----------------------------------------------------------------------
Comment By: Walter Dörwald (doerwalter)
Date: 2005-08-31 18:11
Message:
Logged In: YES
user_id=89016
Here's a simple patch
----------------------------------------------------------------------