[Python-Dev] Stateful codecs [Was: str object going in Py3K]
Walter Dörwald
walter at livinglogic.de
Fri Feb 17 15:38:24 CET 2006
M.-A. Lemburg wrote:
> Walter Dörwald wrote:
>> Guido van Rossum wrote:
>>
>>> [...]
>>> Years ago I wrote a prototype; check out sandbox/sio/.
>> However sio.DecodingInputFilter and sio.EncodingOutputFilter don't work
>> for encodings that need state (e.g. when reading/writing UTF-16).
>> Switching to stateful encoders/decoders isn't so easy, because the
>> stateful codecs require a stream-API, which brings in a whole bunch of
>> other functionality (readline() etc.), which we'd probably like to keep
>> separate. I have a patch (http://bugs.python.org/1101097) that should
>> fix this problem (at least for all codecs derived from
>> codecs.StreamReader/codecs.StreamWriter). Additionally it would make
>> stateful codecs more useful in the context of iterators/generators.
>>
>> I'd like this patch to go into 2.5.
>
> The patch as-is won't go into 2.5. It's simply the wrong approach:
> StreamReaders and -Writers work on streams (hence the name). It
> doesn't make sense adding functionality to side-step this behavior,
> since it undermines the design.
I agree that using a StreamWriter without a stream somehow feels wrong.
> Like I suggested in the patch discussion, such functionality could
> be factored out of the implementations of StreamReaders/Writers
> and put into new StatefulEncoder/Decoder classes, the objects of
> which then get used by StreamReader/Writer.
>
> In addition to that we could extend the codec registry to also
> maintain slots for the stateful encoders and decoders, if needed.
We *have* to do it like this; otherwise there would be no way to get a
StatefulEncoder/Decoder from an encoding name.
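To make the idea of a stream-free stateful decoder concrete, here is a
minimal sketch. The class name StatefulDecoder and its decode(data, final)
signature are hypothetical, UTF-16-LE is hardcoded to keep it short, and it
is written for a current Python version. It only relies on
codecs.utf_16_le_decode() reporting how many bytes it consumed, so the
unconsumed rest can be kept for the next call:

```python
import codecs


class StatefulDecoder:
    """Hypothetical stream-free decoder; keeps undecoded bytes between calls."""

    def __init__(self, errors="strict"):
        self.errors = errors
        self.buffer = b""  # bytes left over from the previous call

    def decode(self, data, final=False):
        data = self.buffer + data
        # With final=False a trailing incomplete sequence is not an error;
        # it is simply not consumed.
        text, consumed = codecs.utf_16_le_decode(data, self.errors, final)
        self.buffer = data[consumed:]
        return text


# "hi!" in UTF-16-LE, split in the middle of code units.
chunks = [b"h\x00i", b"\x00!", b"\x00"]

decoder = StatefulDecoder()
result = "".join(decoder.decode(chunk) for chunk in chunks)

# A stateless decode of the first chunk alone fails on the dangling byte.
try:
    chunks[0].decode("utf-16-le")
    stateless_failed = False
except UnicodeDecodeError:
    stateless_failed = True
```

No stream, read() or readline() is involved, so an object like this could
be driven directly from an iterator or generator.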
Does this mean that codecs.lookup() would have to return a 6-tuple? But
this would break if someone uses codecs.lookup("foo")[-1]. So maybe
codecs.lookup() should return an instance of a subclass of tuple which
has the StatefulEncoder/Decoder as attributes. But then codecs.lookup()
must be able to handle old 4-tuples returned by old search functions and
update those to the new 6-tuples. (But we could drop this again after
several releases, once all third-party codecs are updated.)
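A rough sketch of the tuple subclass idea follows. All names here
(CodecTuple, upgrade, statefulencoder, statefuldecoder) are made up for
illustration, and this is only one possible reading of the proposal: the
object stays a 4-tuple so that existing indexing such as [-1] keeps
working, and the stateful classes are reachable only as attributes.

```python
import codecs


class CodecTuple(tuple):
    """Hypothetical 4-tuple subclass with stateful codec classes as attributes."""

    def __new__(cls, encode, decode, streamreader, streamwriter,
                statefulencoder=None, statefuldecoder=None):
        self = tuple.__new__(cls, (encode, decode, streamreader, streamwriter))
        # The extra entries live outside the tuple items, so len() stays 4.
        self.statefulencoder = statefulencoder
        self.statefuldecoder = statefuldecoder
        return self


def upgrade(entry):
    """Wrap a plain 4-tuple from an old search function; pass new entries through."""
    if isinstance(entry, CodecTuple):
        return entry
    return CodecTuple(*entry)


# What an old-style search function would return for UTF-8.
old_entry = (
    codecs.utf_8_encode,
    codecs.utf_8_decode,
    codecs.getreader("utf-8"),
    codecs.getwriter("utf-8"),
)
info = upgrade(old_entry)
```

Here info[-1] is still the StreamWriter class, and info.statefuldecoder is
None until the codec provides one.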
Bye,
Walter Dörwald