M.-A. Lemburg wrote:
Walter Dörwald wrote:
Guido van Rossum wrote:
[...] Years ago I wrote a prototype; check out sandbox/sio/. However sio.DecodingInputFilter and sio.EncodingOutputFilter don't work for encodings that need state (e.g. when reading/writing UTF-16). Switching to stateful encoders/decoders isn't so easy, because the stateful codecs require a stream API, which brings in a whole bunch of other functionality (readline() etc.), which we'd probably like to keep separate. I have a patch (http://bugs.python.org/1101097) that should fix this problem (at least for all codecs derived from codecs.StreamReader/codecs.StreamWriter). Additionally it would make stateful codecs more useful in the context of iterators/generators.
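To make the state problem concrete, here is a minimal sketch. It is written for current Python 3, which is an assumption (the thread itself concerns 2.x), and the chunk values and variable names are purely illustrative. It encodes two chunks as UTF-16, once with the stateless encoder and once through a codecs.StreamWriter wrapped around an in-memory stream.

```python
import codecs
import io

chunks = ["ab", "cd"]  # illustrative input, e.g. pieces produced by a generator

# Stateless: every call starts from scratch, so every chunk gets its own BOM.
stateless = b"".join(codecs.encode(chunk, "utf-16") for chunk in chunks)

# Stateful: the StreamWriter remembers that the BOM was already written,
# but it can only be used by handing it a stream object.
buffer = io.BytesIO()
writer = codecs.getwriter("utf-16")(buffer)
for chunk in chunks:
    writer.write(chunk)
stateful = buffer.getvalue()

print(stateless.count(codecs.BOM_UTF16))  # number of BOMs in the stateless output
print(stateful.count(codecs.BOM_UTF16))   # number of BOMs in the stream output
```

The stateless output carries a BOM in front of every chunk, so it is not valid as one UTF-16 document. The StreamWriter gets it right, but only because there is a stream to write to.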
I'd like this patch to go into 2.5.
The patch as-is won't go into 2.5. It's simply the wrong approach: StreamReaders and -Writers work on streams (hence the name). It doesn't make sense to add functionality that side-steps this behavior, since it undermines the design.
I agree that using a StreamWriter without a stream somehow feels wrong.
Like I suggested in the patch discussion, such functionality could be factored out of the implementations of StreamReaders/Writers and put into new StatefulEncoder/Decoder classes, the objects of which then get used by StreamReader/Writer.
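A rough sketch of how that factoring could look, assuming Python 3 and UTF-16 as the example encoding. StatefulUTF16Encoder and StreamWriterSketch are hypothetical names and shapes made up for illustration; they are not the API of the patch or of the codecs module.

```python
import io
import sys


class StatefulUTF16Encoder:
    """Hypothetical stateful encoder: tracks BOM state, knows nothing about streams."""

    def __init__(self, errors="strict"):
        self.errors = errors
        self._bom_written = False
        # After the BOM, keep using the native byte order without a BOM.
        self._plain = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"

    def encode(self, text):
        if not self._bom_written:
            self._bom_written = True
            return text.encode("utf-16", self.errors)  # native order, with BOM
        return text.encode(self._plain, self.errors)


class StreamWriterSketch:
    """Hypothetical stream wrapper: only adds the stream handling on top."""

    def __init__(self, stream, encoder):
        self.stream = stream
        self.encoder = encoder

    def write(self, text):
        self.stream.write(self.encoder.encode(text))


def encode_iter(chunks, encoder):
    """Use the stateful encoder directly in a generator, no stream involved."""
    for chunk in chunks:
        yield encoder.encode(chunk)


from_generator = b"".join(encode_iter(["ab", "cd"], StatefulUTF16Encoder()))

buffer = io.BytesIO()
writer = StreamWriterSketch(buffer, StatefulUTF16Encoder())
writer.write("ab")
writer.write("cd")
from_stream = buffer.getvalue()
```

Both paths produce the same bytes; the state lives in the encoder object, and the stream wrapper is reduced to calling write().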
In addition to that we could extend the codec registry to also maintain slots for the stateful encoders and decoders, if needed.
We *have* to do it like this, otherwise there would be no way to get a StatefulEncoder/Decoder from an encoding name. Does this mean that codecs.lookup() would have to return a 6-tuple? But this would break if someone uses codecs.lookup("foo")[-1]. So maybe codecs.lookup() should return an instance of a subclass of tuple which has the StatefulEncoder/Decoder as attributes. But then codecs.lookup() must be able to handle old 4-tuples returned by old search functions and update those to the new return type. (But we could drop this again after several releases, once all third-party codecs are updated.)

Bye,
   Walter Dörwald
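A minimal sketch of the tuple-subclass idea, assuming Python 3. CodecEntry, upgrade_lookup_result and the attribute names statefulencoder/statefuldecoder are hypothetical and chosen only for illustration. The object still behaves like the old 4-tuple, so indexing with [-1] keeps returning the stream writer, while the stateful classes hang off it as attributes.

```python
import codecs


class CodecEntry(tuple):
    """Hypothetical lookup result: an old-style 4-tuple plus extra attributes."""

    def __new__(cls, encode, decode, streamreader, streamwriter,
                statefulencoder=None, statefuldecoder=None):
        self = tuple.__new__(cls, (encode, decode, streamreader, streamwriter))
        self.statefulencoder = statefulencoder
        self.statefuldecoder = statefuldecoder
        return self


def upgrade_lookup_result(result):
    """Wrap a plain 4-tuple from an old search function in a CodecEntry."""
    if isinstance(result, CodecEntry):
        return result
    if len(result) != 4:
        raise TypeError("codec search function must return a 4-tuple")
    return CodecEntry(*result)


# Simulate what an old search function would hand back: a plain 4-tuple.
legacy = tuple(codecs.lookup("utf-8"))
entry = upgrade_lookup_result(legacy)

streamwriter = entry[-1]             # old-style access still works
stateful = entry.statefulencoder     # None here: the old codec provides none
```

Old search functions keep working unchanged, and callers that only index the tuple never notice the difference.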