On Sat, May 28, 2011 at 10:55 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
> Nick Coghlan wrote:
>> The pedagogic cost of making it even harder than it already is to convince people that bytes are not text would also need to be considered.
>
> I think that boat was missed some time ago. If there were ever a serious intention to teach people that bytes are not text by limiting the feature set of bytes, it would have been better served by not giving bytes *any* features that assumed a particular encoding.
>
> As it is, bytes has quite a lot of features that implicitly treat it as ASCII-encoded text: the literal and repr() forms, capitalize(), expandtabs(), lower(), splitlines(), swapcase(), title(), upper(), and all the is*() methods.
>
> Accepting all of that, and then saying "Oh, no, we couldn't possibly provide a format() method, because bytes are not text" seems a tad inconsistent.
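[For concreteness, here is a quick illustration of the methods Greg lists; all of them exist on Python 3 bytes objects and assume ASCII semantics:]

    # All of these are real methods on Python 3 bytes objects; each one
    # treats the underlying data as ASCII-encoded text.
    data = b"hello, World\tx"

    print(data.upper())                  # b'HELLO, WORLD\tX'
    print(data.capitalize())             # b'Hello, world\tx'
    print(data.title())                  # b'Hello, World\tX'
    print(data.swapcase())               # b'HELLO, wORLD\tX'
    print(data.expandtabs(4))            # tab stops only make sense for text
    print(b"line1\nline2".splitlines())  # [b'line1', b'line2']
    print(b"abc".isalpha(), b"123".isdigit())  # True True
    print(b"caf\xc3\xa9")                # repr shows ASCII bytes as characters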
Originally we didn't have all of that - more and more of it crept back in at the behest of several binary protocol folks (including me, if I recall correctly). The urllib.parse experience has convinced me that giving in to that pressure was a mistake. We went for a premature optimisation, and screwed up the bytes API as a result.

Yes, there is a potential performance issue with the decode/process/encode model, but simply keeping a bunch of string methods in the bytes API was the wrong answer (and something that isn't actually all that useful in practice, for the reasons brought up in this and other recent threads).

Perhaps it is time to resurrect the idea of an explicit 'ascii' type? Add a'' literals, support the full string API as well as the bytes API, and deprecate all string APIs on bytes and bytearray objects.

The other thing I have learned in trying to deal with some of these issues is that ASCII-encoded text really *is* special, compared to all other encodings, due to its widespread use in a multitude of networking protocols and other formats.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
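[A minimal sketch of the decode/process/encode model mentioned above, assuming the input is known to be ASCII, as it is for many wire protocols; the header-normalising helper is purely illustrative and not part of any stdlib API:]

    raw = b"Content-Type: TEXT/HTML; charset=UTF-8\r\n"

    def normalise_header(line: bytes) -> bytes:
        # decode: bytes -> str, using the encoding the protocol mandates
        text = line.decode("ascii")
        # process: use the full str API, including format()
        name, _, value = text.partition(":")
        result = "{}: {}".format(name.strip().title(), value.strip().lower())
        # encode: str -> bytes again for the wire
        return result.encode("ascii") + b"\r\n"

    print(normalise_header(raw))  # b'Content-Type: text/html; charset=utf-8\r\n'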