[Python-ideas] A possible transition plan to bytes-based iteration and indexing for binary data

Sun Jun 15 17:24:29 CEST 2014

On Sun, Jun 15, 2014 at 10:33:14PM +1000, Nick Coghlan wrote:
> At PyCon earlier this year, Guido (and others) persuaded me that the
> integer based indexing and iteration for bytes and bytearray in Python
> 3 was a genuine design mistake based on the initial Python 3 design
> which lacked an immutable bytes type entirely (so producing integers
> was originally the only reasonable choice).
[...]
> The general principle involved would be to return an integer *subtype*

Have you considered subclassing bytes, rather than int?

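# Current Python 3 behaviour: iterating over bytes yields ints.
for i in b"foo":
    assert isinstance(i, int)
# Suggested: a bytes subclass whose iteration yields length-1 bytes.
for b in sensible_bytes(b"foo"):
    assert isinstance(b, bytes)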

I'm not wedded to the name :-)
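
To make the suggestion concrete, here is a minimal pure-Python sketch of what such a subclass might look like. Everything in it is an assumption for illustration: the class name is just the placeholder used above, and extending the behaviour to integer indexing and to slices is my guess at what a fuller version would want, not something that has been specified.

```python
import operator


class sensible_bytes(bytes):
    """Hypothetical bytes subclass: iteration and integer indexing
    produce length-1 bytes objects instead of ints."""

    def __iter__(self):
        for i in range(len(self)):
            # Slicing via the base class always gives bytes, never int.
            yield bytes.__getitem__(self, slice(i, i + 1))

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Assumption: slices keep the subclass type.
            return type(self)(bytes.__getitem__(self, index))
        i = operator.index(index)
        if i < 0:
            i += len(self)
        if not 0 <= i < len(self):
            raise IndexError("index out of range")
        return bytes.__getitem__(self, slice(i, i + 1))


assert list(sensible_bytes(b"foo")) == [b"f", b"o", b"o"]
assert sensible_bytes(b"foo")[0] == b"f"
```

Note that code written against plain bytes, such as sum(data), raises TypeError when handed one of these, which is exactly the substitutability problem mentioned below.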

And then, perhaps some time in the distant future when porting 
from Python 2.7 is no longer a priority, we can add

from __future__ import bytes_iteration_yields_bytes

There are at least two obvious downsides: the b'' syntax will still refer 
to the less useful type, and it will be a violation of the Liskov 
substitution principle (but then I've always considered that to be a 
guideline rather than a hard law).

> It wouldn't be pretty, and it would be a pain to document, but it
> seems feasible. The alternative is for PEP 467 to add a separate bytes
> iteration method, which strikes me as further entrenching a design we
> aren't currently happy with.

Unless you have a strategy to deprecate *and remove* the magic int 
subclass some time in the foreseeable future, you're still entrenching 
the design. I think whatever we do, we're going to end up with something 
ugly in the language. Possibly the least ugly, and certainly the least 
magic, is a separate bytes iteration method.
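
For comparison, a separate iteration method needs no new type at all. The sketch below uses a free function as a stand-in for such a method; the name iterbytes is hypothetical, chosen only for this example, and a real version would presumably live on bytes and bytearray themselves.

```python
def iterbytes(data):
    """Yield each byte of a bytes or bytearray object as a
    length-1 bytes object (hypothetical helper)."""
    for i in range(len(data)):
        # A one-element slice is never an int; bytes() normalises
        # the bytearray case to an immutable result.
        yield bytes(data[i:i + 1])


# Existing behaviour is untouched: plain iteration still yields ints.
assert list(b"foo") == [102, 111, 111]
# Opting in to bytes-based iteration is explicit at the call site.
assert list(iterbytes(b"foo")) == [b"f", b"o", b"o"]
```

Nothing about the bytes type changes here, so existing code keeps working and the new behaviour is visible wherever it is used.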

Keeping-an-open-mind-but-leaning-towards-minus-one-on-the-idea-ly y'rs,

-- 
Steven