On 2008-06-12 16:59, Walter Dörwald wrote:
M.-A. Lemburg wrote:
.transform() and .untransform() use the codecs to apply same-type conversions. They do apply type checks to make sure that the codec does indeed return the same type.
E.g. text.transform('xml-escape') or data.transform('base64').
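For illustration, the type check described above could be sketched roughly as follows in present-day Python 3. The helper functions `transform()` and `untransform()` here are hypothetical stand-ins for the proposed methods, not an existing API; they only assume that `codecs.encode()`/`codecs.decode()` accept same-type codecs such as "base64" and "rot13".

```python
import codecs


def transform(obj, encoding, errors="strict"):
    """Hypothetical helper: encode via a codec, insisting on a same-type result."""
    result = codecs.encode(obj, encoding, errors)
    if type(result) is not type(obj):
        raise TypeError(
            f"codec {encoding!r} returned {type(result).__name__}, "
            f"expected {type(obj).__name__}"
        )
    return result


def untransform(obj, encoding, errors="strict"):
    """Hypothetical helper: decode via a codec, insisting on a same-type result."""
    result = codecs.decode(obj, encoding, errors)
    if type(result) is not type(obj):
        raise TypeError(
            f"codec {encoding!r} returned {type(result).__name__}, "
            f"expected {type(obj).__name__}"
        )
    return result


print(transform(b"hello", "base64"))         # bytes in, bytes out
print(untransform(b"aGVsbG8=\n", "base64"))  # bytes in, bytes out
print(transform("hello", "rot13"))           # str in, str out
```

A codec that changes the type, e.g. `transform("abc", "utf-8")`, would be rejected with a TypeError under this sketch.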
So what would a base64 codec do with the errors argument?
It could use it to e.g. try to recover as much data as possible from broken input data.
Currently (in Py2.x), it raises an exception if you pass in anything but "strict".
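As a rough sketch of what a non-strict errors mode could mean for base64: the function below is purely hypothetical (it is not how the shipped codec behaves), and the "ignore" policy it implements, dropping bytes outside the alphabet and repairing the padding, is just one assumed recovery strategy.

```python
import base64

# Standard base64 alphabet, without the "=" padding character.
_ALPHABET = (
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789+/"
)


def b64decode_with_errors(data, errors="strict"):
    """Hypothetical base64 decoder that honours an errors argument."""
    if errors == "strict":
        # validate=True rejects any byte outside the alphabet
        # (in this sketch that includes line breaks).
        return base64.b64decode(data, validate=True)
    if errors != "ignore":
        raise ValueError(f"unsupported error handler: {errors!r}")
    # Keep only alphabet bytes; stray bytes and old padding are dropped.
    cleaned = bytes(c for c in data if c in _ALPHABET)
    # A single leftover character cannot encode a full byte, so drop it.
    if len(cleaned) % 4 == 1:
        cleaned = cleaned[:-1]
    # Re-pad to a multiple of four characters.
    cleaned += b"=" * (-len(cleaned) % 4)
    return base64.b64decode(cleaned)


# Broken input: a stray "!" and missing padding.
print(b64decode_with_errors(b"aGVs!bG8", "ignore"))
```

With errors="strict" the same broken input raises binascii.Error instead.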
I think for transformations we don't need the full codec machinery: ...
No need to invent another wheel :-) The codecs already exist for Py2.x and can be used by the .encode()/.decode() methods in Py2.x (where no type checks occur).
By using a new API we could get rid of old warts. For example: Why does the stateless encoder/decoder return how many input characters/bytes it has consumed? It must consume *all* bytes anyway!
No, it doesn't, and that's the point of having those return values :-)
Even though the encoders/decoders are stateless, that doesn't mean they have to consume all input data. The caller is responsible for making sure that all input data was in fact consumed.
You could for example have a decoder that stops decoding after having seen a block end indicator, e.g. a base64 line end or XML closing element.
Just because all codecs that ship with Python always try to decode the complete input doesn't mean that the feature isn't being used. The interface was designed to allow for the above situations.
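To make this concrete, here is a minimal sketch of such a stateless decoder in Python 3. The names `decode_base64_line()` and `decode_all()` are made up for this example; the only thing taken from the codec interface is the convention of returning an (output, consumed) tuple.

```python
import base64


def decode_base64_line(data, errors="strict"):
    """Stateless decoder that stops after the first base64 line.

    Returns (decoded_bytes, number_of_input_bytes_consumed), following
    the (output, consumed) convention of stateless codec functions.
    The errors argument is accepted but unused in this sketch.
    """
    end = data.find(b"\n")
    # Consume up to and including the line end, or everything if none.
    consumed = len(data) if end == -1 else end + 1
    line = data[:consumed].rstrip(b"\r\n")
    return base64.b64decode(line), consumed


def decode_all(data):
    """Caller-side loop: keep calling until all input is consumed."""
    chunks = []
    pos = 0
    while pos < len(data):
        output, consumed = decode_base64_line(data[pos:])
        chunks.append(output)
        pos += consumed
    return b"".join(chunks)


data = b"aGVsbG8=\nd29ybGQ=\n"
print(decode_base64_line(data))  # decodes the first line only
print(decode_all(data))          # caller drives decoding to the end
```

Without the consumed count in the return value, the caller in `decode_all()` would have no way of knowing where the decoder stopped.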