[Python-Dev] Decoding incomplete unicode

"Martin v. Löwis" martin at v.loewis.de
Tue Aug 24 22:38:39 CEST 2004


M.-A. Lemburg wrote:
> However, I don't like the way that the patch
> implements this state handling: I think we should use a
> generic "state" object here which is passed to the stateful
> codec and returned together with the standard return values
> on output:

Why is that better? Practicality beats purity.
This is useless over-generalization.

> Otherwise we'll end up having different interface
> signatures for all codecs and extending them to accommodate
> for future enhancement will become unfeasible without
> introducing yet another set of APIs.

What is "a codec" here? A class implementing the StreamReader
and/or Codec interface? Walter's patch does not change the
API of any of these. It just adds a few functions to some
module, which are not meant to be called directly.

> If we leave out the UTF-7 codec changes in the
> patch, the only state that the UTF-8 and UTF-16
> codecs create is the number of bytes consumed. We already
> have the required state parameter for this in the
> standard decode API, so no extra APIs are needed for
> these two codecs.

Where precisely is the number of decoded bytes in the API?
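
For concreteness, here is a minimal sketch of what "number of bytes
consumed" means for a UTF-8 decoder. It assumes the present-day CPython
function codecs.utf_8_decode(data, errors, final), which returns a
(text, consumed) pair; whether that matches the exact interface under
discussion in this thread is an assumption. The helper decode_chunks is
a hypothetical name used only for illustration.

```python
import codecs


def decode_chunks(chunks):
    """Decode an iterable of UTF-8 byte chunks that may split characters.

    Hypothetical helper: the only state carried between calls is the
    undecoded tail of the buffer, derived from the consumed-byte count.
    """
    pending = b""
    pieces = []
    for chunk in chunks:
        buf = pending + chunk
        # final=False: an incomplete trailing sequence is not an error;
        # the decoder stops before it and reports how far it got.
        text, consumed = codecs.utf_8_decode(buf, "strict", False)
        pieces.append(text)
        pending = buf[consumed:]
    # final=True: leftover bytes at this point are a genuine error.
    text, _ = codecs.utf_8_decode(pending, "strict", True)
    pieces.append(text)
    return "".join(pieces)


euro = "\u20ac".encode("utf-8")  # three bytes: e2 82 ac

# Only the first two bytes are available, so nothing can be decoded yet.
partial_text, partial_consumed = codecs.utf_8_decode(euro[:2], "strict", False)

# The same character split across two chunks, followed by an ASCII byte.
joined = decode_chunks([euro[:1], euro[1:], b"a"])
```

In this sketch the caller, not the codec, holds on to the unconsumed
bytes, so the consumed count is the only state that crosses the call
boundary. A codec with richer internal state, such as UTF-7, would
need more than this.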

Regards,
Martin


