[Python-Dev] transform() and untransform() methods, and the codec registry
victor.stinner at haypocalc.com
Fri Dec 3 10:16:04 CET 2010
On Thursday 02 December 2010 19:06:51 georg.brandl wrote:
> Author: georg.brandl
> Date: Thu Dec 2 19:06:51 2010
> New Revision: 86934
> #7475: add (un)transform method to bytes/bytearray and str, add back codecs
> that can be used with them from Python 2.
Oh no, someone did it. Was it really necessary to reintroduce rot13 and friends?
I'm not strongly opposed to .transform()/.untransform() if they can be completely
separated from text encodings (ascii, latin9, utf-8 & co.). But str.encode() and
bytes.decode() do accept transform codec names and raise strange error
messages. Quoting Martin von Löwis (#7475):
"If the codecs are restored, one half of them becomes available to
.encode/.decode methods, since the codec registry cannot tell which
ones implement real character encodings, and which ones are other
conversion methods. So adding them would be really confusing."
Examples of the strange error messages:
TypeError: 'str' does not support the buffer interface
TypeError: expected an object with the buffer interface
TypeError: decoder did not return a str object (type=bytes)
TypeError: encoder did not return a bytes object (type=str)
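As a minimal sketch of the point in Martin's quote: text encodings and transform codecs are looked up through the same registry call, so the lookup alone does not say which kind you got. This assumes a current CPython 3 with the rot_13 codec available; it does not reproduce the exact errors above, which come from the development branch of the time.

```python
import codecs

# Both names resolve through the same registry function.
text_codec = codecs.lookup('utf-8')        # a real character encoding
transform_codec = codecs.lookup('rot_13')  # a str-to-str transform

# Both lookups return the same kind of object.
print(type(text_codec).__name__, type(transform_codec).__name__)

# str -> bytes: what a text encoding is expected to do.
encoded = codecs.encode('abc', 'utf-8')

# str -> str: same call, but the result is not bytes.
rotated = codecs.encode('abc', 'rot_13')

print(type(encoded).__name__, type(rotated).__name__)
```

The only visible difference is the type of the result, which is why passing a transform codec name to str.encode() or bytes.decode() can only fail after the codec has already run.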
I don't like transform() and untransform() because I think that we should not
add too many operations to the base types (bytes and str), and because they do
an implicit module import. I prefer explicit module imports (e.g. import binascii;
binascii.hexlify(b'to hex')). It reminds me of PHP and its ugly namespace with
5000+ functions. I prefer Python because it uses smaller and more numerous
namespaces, which are more specific and well defined. If we add email and
compression functions to bytes, why not add a web browser to str?
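The explicit-import style preferred above can be sketched as follows. binascii.hexlify(b'to hex') is taken from the message; zlib and base64 are illustrative choices standing in for the "compression" and "email" functions, not modules named in the thread.

```python
import base64
import binascii
import zlib

data = b'to hex'

# Hex conversion: the module name says where the operation lives.
hex_bytes = binascii.hexlify(data)
assert binascii.unhexlify(hex_bytes) == data

# Compression: a separate, specific namespace.
compressed = zlib.compress(data)
assert zlib.decompress(compressed) == data

# Base64, as used in email bodies: again its own module.
b64 = base64.b64encode(data)
assert base64.b64decode(b64) == data

print(hex_bytes)
```

Each operation is reached through a module that the reader can see being imported, rather than through a codec name passed as a string to a method on bytes.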