[Python-ideas] Adding 'bytes' as alias for 'latin_1' codec.

Bill Janssen janssen at parc.com
Tue May 31 18:16:46 CEST 2011


Nick Coghlan <ncoghlan at gmail.com> wrote:

> Perhaps it is time to resurrect the idea of an explicit 'ascii' type?
> Add a'' literals, support the full string API as well as the bytes
> API, deprecate all string APIs on bytes and bytearray objects. The
> other thing I have learned in trying to deal with some of these issues
> is that ASCII-encoded text really *is* special, compared to all other
> encodings, due to its widespread use in a multitude of networking
> protocols and other formats.

I like the deprecations you suggest, but I'd prefer to see a more
general solution: extend the 'str' type so that it has two possible
internal representations, the current format and an "encoded" format
kept as an array of bytes plus the name of its encoding.  It would
transcode only as necessary -- for example, the 're' module might
require the current Unicode representation.  An explicit method would
be added to allow the user to force transcoding.
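
A rough pure-Python sketch of the idea, for illustration only.  The
class name EncodedStr and its methods (is_encoded, transcode) are
hypothetical; nothing like this exists in the standard library, and a
real implementation would live inside 'str' at the C level.  It only
shows the "keep the bytes until someone needs code points" behaviour:

```python
import codecs


class EncodedStr:
    """Hypothetical text type with two internal representations:
    decoded text, or raw bytes plus the name of their encoding."""

    def __init__(self, data, encoding=None):
        if isinstance(data, bytes):
            if encoding is None:
                raise TypeError("bytes input requires an encoding")
            # "Encoded" representation: keep the bytes untouched.
            self._raw = data
            self._encoding = encoding
            self._text = None
        else:
            # Ordinary decoded representation.
            self._raw = None
            self._encoding = None
            self._text = data

    @property
    def is_encoded(self):
        """True while the value is still held as bytes plus an encoding."""
        return self._text is None

    def transcode(self):
        """Explicitly force conversion to the decoded representation."""
        if self._text is None:
            self._text = self._raw.decode(self._encoding)
            self._raw = None
            self._encoding = None
        return self._text

    def encode(self, encoding):
        """Return bytes; no work is done if the stored encoding matches."""
        if self._text is None:
            # Compare normalized codec names so 'latin-1' == 'latin_1'.
            wanted = codecs.lookup(encoding).name
            stored = codecs.lookup(self._encoding).name
            if wanted == stored:
                return self._raw
        # Otherwise decode first (this switches representation).
        return self.transcode().encode(encoding)

    def __len__(self):
        # Length in code points needs the decoded form.
        return len(self.transcode())

    def __str__(self):
        return self.transcode()


s = EncodedStr(b"caf\xe9", "latin-1")
same = s.encode("latin_1")      # returned as stored, nothing decoded
still_encoded = s.is_encoded
text = s.transcode()            # user forces transcoding
```

In this sketch, data read off the wire and written back in the same
encoding is never decoded; only operations that need code points
(len(), str(), or a different target encoding) trigger the conversion.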

This would complicate life at the C level, to be sure.  Though, perhaps
not so much, given the proper macrology.

Bill


