[ python-Bugs-1235646 ] codecs.StreamRecoder.next doesn't encode

SourceForge.net noreply at sourceforge.net
Fri Sep 2 20:33:54 CEST 2005


Bugs item #1235646, was opened at 2005-07-10 18:55
Message generated for change (Comment added) made by doerwalter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1235646&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Unicode
Group: Python 2.4
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Sebastian Wangnick (wangnick)
Assigned to: Walter Dörwald (doerwalter)
Summary: codecs.StreamRecoder.next doesn't encode

Initial Comment:
codecs.StreamRecoder.next doesn't encode the data it 
gets from self.reader.next. This breaks the "for line in 
codecs.EncodedFile(...)" idiom.
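
A minimal sketch of the idiom in question, written for Python 3 (where the
iterator method is __next__ rather than next). The sample text and the choice
of Latin-1 for the file and UTF-8 for the data side are assumptions for
illustration, not taken from the report. With the encoding step in place,
each line read from the Latin-1 backend should come out as UTF-8 bytes:

```python
import codecs
import io

# Backend stream holds Latin-1 bytes (sample text is illustrative only).
backend = io.BytesIO("Grüße\nWelt\n".encode("latin-1"))

# EncodedFile returns a StreamRecoder: the file side is Latin-1,
# the data side (what the caller reads) is UTF-8.
recoder = codecs.EncodedFile(backend, data_encoding="utf-8",
                             file_encoding="latin-1")

# Iterating goes through StreamRecoder's iterator method, which must
# encode what it gets from the underlying reader.
lines = [line for line in recoder]
```

If the iterator skipped the encoding step, the lines would be returned in
their decoded form instead of as UTF-8 data.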

----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2005-09-02 20:33

Message:
Logged In: YES 
user_id=89016

OK, now I'm beginning to understand the docstring.
Nevertheless I think having a class that uses stateful
codecs at both ends would be useful. If you want, I can give
this a try (after I'm back from vacation in four weeks).

Closing the report as fixed.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2005-09-01 20:28

Message:
Logged In: YES 
user_id=38388

Thanks, Walter.

StreamRecoder is not broken: it works as advertised (see
the .__init__() doc-string and interface) and yes, this
means that only stateless encodings can be used, such as
UTF-16-LE, simply because the encode and decode
functions are defined as being stateless.
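
As a rough illustration of why a stateless encoding such as UTF-16-LE is
safe here (a sketch for Python 3; the sample strings are assumptions):
encoding a text piece by piece gives the same bytes as encoding it in one
call, so it does not matter how the reads are split.

```python
import codecs

# UTF-16-LE has a fixed byte order and writes no BOM, so each call is
# independent of any previous call.
piecewise = codecs.encode("ab", "utf-16-le") + codecs.encode("cd", "utf-16-le")
whole = codecs.encode("abcd", "utf-16-le")
```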




----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2005-09-01 14:22

Message:
Logged In: YES 
user_id=89016

Checked in as:
Lib/codecs.py 1.48/1.35.2.10

I'll try to add tests for StreamRecoder tomorrow.

StreamRecoder is broken in its current form, as it uses the
stateless codec for the frontend encoding. Recoding from
e.g. latin-1 to utf-16 will return a BOM for every call to
read() which is clearly wrong. What gets read from the
backend stream should be pushed through a *stateful*
encoder. BTW, a feed style API would help here ;)
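
The BOM problem can be sketched outside of StreamRecoder (Python 3; the
chunk contents are assumptions). The stateless "utf-16" encode function
starts every result with a BOM, whereas a stateful encoder writes it only
once. The incremental encoder used for comparison was added to the codecs
module after this report (Python 2.5) and is shown only as one possible
form of a stateful encoder, not as the fix that was checked in.

```python
import codecs

chunks = ["ab", "cd"]  # stands in for two successive read() results

# Stateless: every call is independent, so every result carries a BOM.
per_call = [codecs.encode(chunk, "utf-16") for chunk in chunks]

# Stateful: the encoder remembers that the BOM was already written.
encoder = codecs.getincrementalencoder("utf-16")()
incremental = [encoder.encode(chunk) for chunk in chunks]

bom = codecs.BOM_UTF16  # BOM in the platform's native byte order
```

Joining the per_call results puts a second BOM in the middle of the data,
which is the behaviour described above for repeated read() calls.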


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2005-08-31 22:58

Message:
Logged In: YES 
user_id=38388

Looks good, Walter.

Please check it in.

Thanks.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2005-08-31 18:11

Message:
Logged In: YES 
user_id=89016

Here's a simple patch

----------------------------------------------------------------------
