[Python-Dev] Byte literals (was Re: [Python-checkins] Changing string constants to byte arrays ( r55119 - in python/branches/py3k-struni/Lib: codecs.py test/test_codecs.py ))
Steve Holden
steve at holdenweb.com
Tue May 8 18:21:46 CEST 2007
Guido van Rossum wrote:
> [+python-3000; replies please remove python-dev]
>
> On 5/5/07, Josiah Carlson <jcarlson at uci.edu> wrote:
>> "Fred L. Drake, Jr." <fdrake at acm.org> wrote:
>>> On Saturday 05 May 2007, Aahz wrote:
>>> > I'm with MAL and Fred on making literals immutable -- that's safe and
>>> > lots of newbies will need to use byte literals early in their Python
>>> > experience if they pick up Python to operate on network data.
>>>
>>> Yes; there are lots of places where bytes literals will be used the way str
>>> literals are today. buffer(b'...') might be good enough, but it seems more
>>> than a little idiomatic, and doesn't seem particularly readable.
>>>
>>> I'm not suggesting that /all/ literals result in constants, but bytes literals
>>> seem like a case where what's wanted is the value. If b'...' results in a
>>> new object on every reference, that's a lot of overhead for a network
>>> protocol implementation, where the data is just going to be written to a
>>> socket or concatenated with other data. An immutable bytes type would be
>>> very useful as a dictionary key as well, and more space-efficient than
>>> tuple(b'...').
>> I was saying the exact same thing last summer. See my discussion with
>> Martin about parsing/unmarshaling. What I expect will happen with bytes
>> as dictionary keys is that people will end up subclassing dictionaries
>> (with varying amounts of success and correctness) to do something like
>> the following...
>>
>> class bytesKeys(dict):
>>     ...
>>     def __setitem__(self, key, value):
>>         if isinstance(key, bytes):
>>             key = key.decode('latin-1')
>>         else:
>>             raise KeyError("only bytes can be used as keys")
>>         dict.__setitem__(self, key, value)
>>     ...
>>
>> Is it optimal? No. Would it be nice to have immutable bytes? Yes. Do
>> I think it will really be a problem in parsing/unmarshaling? I don't
>> know, but the fact that there now exists a reasonable literal syntax b'...'
>> rather than the previous bytes([1, 2, 3, ...]) means that we are coming
>> much closer to having what really is about the best way to handle this;
>> Python 2.x str.
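To make the dictionary-key point concrete, here is a minimal sketch. It
assumes Python 3 as it eventually shipped, where bytes is immutable and
hashable and bytearray is the mutable counterpart; that split had not been
decided when this thread was written, and the header name is only an
illustration.

```python
# Immutable bytes is hashable, so it can be used directly as a dict key.
headers = {b"Content-Length": 42}
assert headers[b"Content-Length"] == 42

# A mutable byte sequence (bytearray here, as an assumption standing in for
# the mutable bytes type under discussion) cannot be hashed.
error = None
try:
    {bytearray(b"Content-Length"): 42}
except TypeError as exc:
    error = str(exc)

# The workaround described above: convert to an immutable, hashable form first.
key = bytearray(b"Content-Length").decode("latin-1")
by_text = {key: 42}
```

With only a mutable type available, every lookup pays for that conversion,
which is the overhead the dict subclass above tries to hide.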
>
> I don't know how this will work out yet. I'm not convinced that having
> both mutable and immutable bytes is the right thing to do; but I'm
> also not convinced of the opposite. I am slowly working on the
> string/unicode unification, and so far, unfortunately, it is quite
> daunting to get rid of 8-bit strings even at the Python level let
> alone at the C level.
>
> I suggest that the following exercise, to be carried out in the
> py3k-struni branch, might be helpful: (1) change the socket module to
> return bytes instead of strings (it already takes bytes, by virtue of
> the buffer protocol); (2) change its makefile() method so that it uses
> the new io.py library, in particular the SocketIO wrapper there; (3)
> fix up the httplib module and perhaps other similar ones. Take copious
> notes while doing this. Anyone up for this? I will listen! (I'd do it
> myself but I don't know where I'd find the time).
>
I'm having a hard time understanding why bytes literals would be a good
thing. OK, displays require the work of creating a new object (since
bytes types will be mutable), but surely a mutable literal is always
going to make programs potentially hard to read.

If you want a representation of a bytes object in your program text,
doesn't that always (like other mutable types) have to represent the
same value, creating new objects as necessary if previously-created
objects could have been mutated?

What am I missing here?
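
A small sketch of the behaviour in question, under the assumption that
bytearray (the mutable type in Python 3 as eventually released) stands in
for the mutable bytes type being discussed. The function name is
hypothetical.

```python
def request_prefix():
    # Each call evaluates the expression again and builds a fresh mutable
    # object, just as a list display such as [1, 2, 3] does.
    return bytearray(b"GET ")

first = request_prefix()
first += b"/index.html"      # mutates the first object in place
second = request_prefix()    # a new object holding the original value

assert first == bytearray(b"GET /index.html")
assert second == bytearray(b"GET ")
assert first is not second
```

The text in the source always denotes the same value, at the cost of one
new object per evaluation.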
regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden