[Python-Dev] Decoding incomplete unicode

"Martin v. Löwis" martin at v.loewis.de
Wed Aug 18 22:51:57 CEST 2004


Walter Dörwald wrote:
> They will not, because StreamReader.decode() already is a feed
> style API (but with state amnesia).
> 
> Any stream decoder that I can think of can be (and most are)
> implemented by overriding decode().

I consider that an unfortunate implementation artefact. You
either use the stateless encode/decode functions that you get
from codecs.getencoder()/codecs.getdecoder(), or you use the
file API on the streams. You never ever call encode/decode on
the streams themselves.
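
To make the two entry points concrete, here is a minimal sketch.
It uses present-day Python 3 syntax rather than the Python 2 of
this thread, and the sample string is just an assumption for
illustration:

```python
import codecs
import io

data = "Löwis".encode("utf-8")

# Stateless use: decode a complete byte string in a single call.
# The function returns the text and the number of bytes consumed.
decode = codecs.getdecoder("utf-8")
text, consumed = decode(data)

# Stream use: wrap a byte stream and go through the file-style
# read() method, never through decode() on the reader.
reader = codecs.getreader("utf-8")(io.BytesIO(data))
stream_text = reader.read()
```

Both paths yield the same text here because the input is complete;
they differ only once the input arrives in pieces.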

I would have preferred the default .write implementation
to call self._internal_encode, with the Writer *containing*
a Codec rather than inheriting from Codec. Alas, for
(I guess) simplicity, a more direct (and more confusing)
approach was taken.
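
A rough sketch of the containment design I have in mind, again
in present-day Python 3. The class name ContainingWriter is
hypothetical, and _internal_encode is only the name used above,
not an existing method of the codecs module:

```python
import codecs
import io


class ContainingWriter:
    """Hypothetical writer that holds an encoder instead of
    inheriting from codecs.Codec."""

    def __init__(self, stream, encoding, errors="strict"):
        self.stream = stream
        self.errors = errors
        # The codec is a private member, not a base class, so the
        # writer exposes no public encode() method of its own.
        self._encode = codecs.getencoder(encoding)

    def _internal_encode(self, text):
        # The stateless encoder returns (bytes, characters consumed).
        data, _consumed = self._encode(text, self.errors)
        return data

    def write(self, text):
        self.stream.write(self._internal_encode(text))


buf = io.BytesIO()
writer = ContainingWriter(buf, "utf-8")
writer.write("Dörwald")
```

This only works as written for encodings that need no state
between writes; a stateful encoding would need more than a plain
stateless encoder inside the writer.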

> 1) Having feed() as part of the StreamReader API:
> ---
> s = u"???".encode("utf-8")
> r = codecs.getreader("utf-8")()
> for c in s:
>    print r.feed(c)

Isn't that a totally unrelated issue? Aren't we talking about
short reads on sockets etc?
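
For reference, the short-read problem looks like the sketch
below. It relies on codecs.getincrementaldecoder from present-day
Python, which did not exist when this message was written, and
the split byte string is an assumed stand-in for what a socket
might deliver:

```python
import codecs

data = "ö".encode("utf-8")  # two bytes: b"\xc3\xb6"

# Simulate a short read that splits the character in the middle.
chunks = [data[:1], data[1:]]

# An incremental decoder keeps the incomplete byte between calls
# instead of raising an error or dropping it.
decoder = codecs.getincrementaldecoder("utf-8")()
pieces = [decoder.decode(chunk) for chunk in chunks]
```

The first call returns an empty string because the character is
incomplete; the second call returns the whole character.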

I would very much prefer to solve one problem at a time.

Regards,
Martin

