[Python-ideas] Python 3.x and bytes

Ethan Furman ethan at stoneleaf.us
Wed May 18 22:10:15 CEST 2011


As those who have to work with byte strings know, indexing a single 
element of a byte string returns not a length-1 byte string but an 
int -- a rather important distinction from Unicode strings (str).

This has the frustrating side-effect of

b'abc'[2] == b'c'

being False.
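
For illustration, a quick check under Python 3 (the variable names are
made up for this example) shows where the int comes from:

```python
data = b'abc'

# Indexing a bytes object yields an int, not a length-1 bytes object.
item = data[2]
print(type(item).__name__, item)   # int 99

# Indexing a str yields a length-1 str, so the analogous test succeeds.
print('abc'[2] == 'c')             # True

# Here an int is compared with a bytes object, and they are never equal.
print(data[2] == b'c')             # False

# Comparing against the integer value works.
print(data[2] == ord('c'))         # True
```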

It is far too late to change that particular behavior of byte strings 
(returning ints, that is) -- however, it may not be too late for a 
backwards-compatible change:

Could the bytes class's __eq__ method be modified so that it
    1) checks whether the bytes instance has length 1,
    2) checks whether
       a) the other object is an int, and
       b) 0 <= other_obj < 256, and
    3) if both 1 and 2 hold, compares the int with the instance's
       single element instead of returning NotImplemented?
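
A rough sketch of the rule above, written as a subclass so it can be
tried today. The name ProposedBytes is hypothetical, and this is only
one way the check might look, not how bytes itself would be patched:

```python
class ProposedBytes(bytes):
    """Hypothetical bytes subclass sketching the proposed __eq__ rule."""

    def __eq__(self, other):
        # Steps 1 and 2: a length-1 instance compared with an int in range(256).
        if len(self) == 1 and isinstance(other, int) and 0 <= other < 256:
            # Step 3: compare the int with the single element.
            return self[0] == other
        # Everything else falls back to the normal bytes comparison.
        return super().__eq__(other)

    # Defining __eq__ sets __hash__ to None; keep the inherited hash
    # (an assumption of this sketch, the proposal does not discuss hashing).
    __hash__ = bytes.__hash__


print(b'abc'[2] == ProposedBytes(b'c'))   # True
print(ProposedBytes(b'c') == 100)         # False
print(ProposedBytes(b'cd') == 99)         # False
```

The first comparison works with the int on the left because int.__eq__
returns NotImplemented for a bytes operand, so Python then tries the
reflected comparison on the right-hand object.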

This makes sense to me -- after all, a bytes object is an array of ints
in range(256). It is a special case, but it doesn't feel any more
special than bytes(n) giving a string of n null bytes; and it would
get rid of the idiom, ugly in my opinion, of

some_var[i:i+1] == b'd'
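
To show why the slice form is needed at present, here is a small
example (the values of some_var and i are made up for illustration):

```python
some_var = b'abcd'
i = 3

# Slicing returns a bytes object, so this is a bytes-to-bytes comparison.
print(some_var[i:i+1] == b'd')     # True

# Indexing returns an int, so this comparison is False.
print(some_var[i] == b'd')         # False

# A slice past the end yields b'' instead of raising IndexError.
print(some_var[10:11] == b'd')     # False
```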

It would also not require a new literal syntax.

Thoughts?

~Ethan~