[Python-Dev] Python 3.x and bytes
ethan at stoneleaf.us
Thu May 19 19:50:10 CEST 2011
Nick Coghlan wrote:
> OK, summarising the thread so far from my point of view.
> To be honest, I don't think there is a lot we can do here except to
> further emphasise in the documentation and elsewhere that *bytes is
> not a string type* (regardless of any API similarities retained to
> ease transition from the 2.x series). For example, if we have any
> lingering references to "byte strings" they should be replaced with
> "byte sequences" or "bytes objects" (depending on context, as the
> former phrasing also encompasses bytearray objects).
I think this would be a big help.
> 2. As a concrete usability issue, it is awkward to programmatically
> check the value of a specific byte when working with an ASCII based
> data[i] == b'a' # Intuitive, but always False due to type mismatch
> data[i:i+1] == b'a' # Works, but clumsy
> data[i] == b'a'[0] # Ditto (but at least susceptible to compiler
> const-expression optimisation)
> data[i] == ord('a') # Clumsy and slow
> data[i] == 97 # Hard to read
> Proposals to address this include:
> - introduce a "character" literal to allow c'a' as an alternative to ord('a')
> Potentially workable, but leaves the intuitive answer above
> silently producing an unexpected answer
> For point 2, I'm personally +0 on the idea of having 1-element bytes
> and bytearray objects delegate hashing and comparison operations to
> the corresponding integer object. We have the power to make the
> obvious code correct code, so let's do that. However, the implications
> of the additional key collisions in value based containers may need to
> be explored further.
Nick Coghlan also wrote:
> On further reflection, the key collision and semantics blurring
> problems mean I am at best -0 on this particular solution to the
> problem (and heading fairly rapidly in the direction of -1).
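To make the key collision concern concrete, here is a rough sketch.
OneByte is a hypothetical stand-in I made up for illustration (it is
not a proposed API); it only mimics a 1-element bytes object that
delegates hashing and comparison to the corresponding integer.

```python
class OneByte:
    """Hypothetical stand-in for a 1-element bytes object that hashes
    and compares like the corresponding integer (illustration only)."""

    def __init__(self, value):
        value = bytes(value)
        if len(value) != 1:
            raise ValueError("expected exactly one byte")
        self._value = value

    def __eq__(self, other):
        if isinstance(other, OneByte):
            return self._value == other._value
        if isinstance(other, int):
            # Delegate comparison to the integer value of the byte.
            return self._value[0] == other
        return NotImplemented

    def __hash__(self):
        # Delegate hashing to the integer value of the byte.
        return hash(self._value[0])


# Real bytes and int keys stay distinct in Python 3 today.
current = {97: "int key", b"a": "bytes key"}

# With delegation, the second assignment lands on the existing key 97
# and replaces its value instead of adding a new entry.
delegated = {97: "int key"}
delegated[OneByte(b"a")] = "bytes key"
```

Under those assumptions current keeps two entries while delegated
ends up with one, which is the sort of semantics blurring Nick is
worried about.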
One last thought for a possible 'solution' -- when a bytes object is
tested for equality against an int, raise TypeError. The precedent is
sum() raising a TypeError when passed a list of strings because
performance is so poor. The reason here is that the intuitive comparison
can never be true, so it will always produce silent bugs.
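A minimal sketch of the idea, for illustration only. StrictBytes is
a hypothetical subclass I am using to mimic the behaviour; the actual
suggestion would be a change to bytes itself, and the error message is
made up.

```python
data = b"abc"

# Python 3 today: indexing yields an int, so the intuitive comparison
# is silently False.
intuitive = data[0] == b"a"     # False
sliced = data[0:1] == b"a"      # True
indexed = data[0] == b"a"[0]    # True

# The precedent: sum() refuses to concatenate strings.
try:
    sum(["a", "b"], "")
except TypeError as exc:
    sum_error = str(exc)


class StrictBytes(bytes):
    """Hypothetical bytes subclass: equality tests against an int
    raise TypeError instead of quietly returning False."""

    def __eq__(self, other):
        if isinstance(other, int):
            raise TypeError("cannot compare bytes object with int")
        return super().__eq__(other)

    def __ne__(self, other):
        if isinstance(other, int):
            raise TypeError("cannot compare bytes object with int")
        return super().__ne__(other)

    # Defining __eq__ disables inherited hashing, so restore it.
    __hash__ = bytes.__hash__


# int.__eq__ returns NotImplemented for a bytes operand, so Python
# falls back to the reflected StrictBytes.__eq__, which raises.
try:
    data[0] == StrictBytes(b"a")
    raised = False
except TypeError:
    raised = True
```

With this the silent bug becomes a loud one: raised is True, while
comparisons between two bytes objects keep working as before.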