[Python-Dev] Python 3.x and bytes

Ethan Furman ethan at stoneleaf.us
Thu May 19 19:50:10 CEST 2011

Nick Coghlan wrote:
> OK, summarising the thread so far from my point of view. 


> To be honest, I don't think there is a lot we can do here except to
> further emphasise in the documentation and elsewhere that *bytes is
> not a string type* (regardless of any API similarities retained to
> ease transition from the 2.x series). For example, if we have any
> lingering references to "byte strings" they should be replaced with
> "byte sequences" or "bytes objects" (depending on context, as the
> former phrasing also encompasses bytearray objects).

I think this would be a big help.

> 2. As a concrete usability issue, it is awkward to programmatically
> check the value of a specific byte when working with an ASCII based
> protocol:
>   data[i] == b'a' # Intuitive, but always False due to type mismatch
>   data[i:i+1] == b'a'  # Works, but clumsy
>   data[i] == b'a'[0]  # Ditto (but at least susceptible to compiler
>                       # const-expression optimisation)
>   data[i] == ord('a') # Clumsy and slow
>   data[i] == 97 # Hard to read
> Proposals to address this include:
> - introduce a "character" literal to allow c'a' as an alternative to ord('a')
>     Potentially workable, but leaves the intuitive answer above
>     silently producing an unexpected answer
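
For anyone skimming, the five spellings quoted above can be checked directly. This is only an illustrative sketch; the variable names are mine and are not from the thread.

```python
# Indexing a bytes object yields an int, while slicing yields bytes.
data = b"abc"
i = 0

intuitive = data[i] == b"a"      # int compared with bytes: always False
sliced = data[i:i+1] == b"a"     # bytes compared with bytes
indexed = data[i] == b"a"[0]     # int compared with int
via_ord = data[i] == ord("a")    # int compared with int
literal = data[i] == 97          # int compared with int

print(intuitive, sliced, indexed, via_ord, literal)
# prints: False True True True True
```

Only the first form fails, and it fails without raising or warning by default.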


> For point 2, I'm personally +0 on the idea of having 1-element bytes
> and bytearray objects delegate hashing and comparison operations to
> the corresponding integer object. We have the power to make the
> obvious code correct code, so let's do that. However, the implications
> of the additional key collisions in value based containers may need to
> be explored further.

Nick Coghlan also wrote:
 > On further reflection, the key collision and semantics blurring
 > problems mean I am at best -0 on this particular solution to the
 > problem (and heading fairly rapidly in the direction of -1).
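
The key collision concern can be illustrated with a rough sketch. IntLikeByte is a hypothetical subclass I made up to imitate the delegation idea; it is not a proposed implementation, and the real change would have lived in bytes itself.

```python
class IntLikeByte(bytes):
    """Hypothetical: a 1-element bytes object compares and hashes like its int."""

    def __eq__(self, other):
        if len(self) == 1 and isinstance(other, int):
            return self[0] == other
        return bytes.__eq__(self, other)

    def __hash__(self):
        if len(self) == 1:
            return hash(self[0])
        return bytes.__hash__(self)


# The "obvious" comparison now works...
obvious = b"abc"[0] == IntLikeByte(b"a")

# ...but an int key and a 1-element bytes key now land in the same dict slot.
table = {97: "int key"}
table[IntLikeByte(b"a")] = "bytes key"

print(obvious, len(table), table[97])
# prints: True 1 bytes key
```

The second assignment silently overwrites the value stored under 97, which is the kind of collision in value based containers mentioned above.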

One last thought for a possible 'solution': raise TypeError when a bytes
object is tested for equality against an int.  The precedent is sum(),
which raises a TypeError when asked to sum strings because performance
is so poor.  The reason here is that the intuitive comparison can never
be true, so it will always produce silent bugs.
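
A minimal sketch of what that behavior might look like, using a hypothetical StrictBytes subclass (the name, the error message, and the is_rejected helper are my own; the actual change would be made in bytes itself, so plain literals such as b'a' would be covered too):

```python
class StrictBytes(bytes):
    """Hypothetical: equality tests between bytes and int raise TypeError."""

    def __eq__(self, other):
        if isinstance(other, int):
            raise TypeError("cannot compare bytes with int")
        return bytes.__eq__(self, other)

    def __ne__(self, other):
        if isinstance(other, int):
            raise TypeError("cannot compare bytes with int")
        return bytes.__ne__(self, other)

    # Defining __eq__ would otherwise make the class unhashable.
    __hash__ = bytes.__hash__


def is_rejected(left, right):
    """Return True if left == right raises TypeError."""
    try:
        left == right
    except TypeError:
        return True
    return False


data = b"abc"
# int == StrictBytes: int.__eq__ returns NotImplemented, so Python falls
# back to the reflected StrictBytes.__eq__, which raises.
rejected = is_rejected(data[0], StrictBytes(b"a"))
# bytes compared with bytes is unaffected.
accepted = StrictBytes(b"a") == data[0:1]

print(rejected, accepted)
# prints: True True
```

With this in place the intuitive spelling fails loudly at the comparison instead of quietly evaluating to False.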

