[Python-Dev] PEP 467: last round (?)

Mon Sep 5 12:58:42 EDT 2016

On 09/03/2016 09:48 AM, Nick Coghlan wrote:
> On 3 September 2016 at 21:35, Martin Panter wrote:
>> On 3 September 2016 at 08:47, Victor Stinner wrote:
>>> On Saturday, 3 September 2016, Random832 wrote:
>>>> On Fri, Sep 2, 2016, at 19:44, Ethan Furman wrote:

>>>>> The problem with only having `bchr` is that it doesn't help with
>>>>> `bytearray`;
>>>>
>>>> What is the use case for bytearray.fromord? Even in the rare case
>>>> someone needs it, why not bytearray(bchr(...))?
>>>
>>> Yes, this was my point: I don't think that we need a bytearray method to
>>> create a mutable string from a single byte.
>>
>> I agree with the above. Having an easy way to turn an int into a bytes
>> object is good. But I think the built-in bchr() function on its own is
>> enough. Just like we have bytes object literals, but the closest we
>> have for a bytearray literal is bytearray(b". . .").
>
> This is a good point - earlier versions of the PEP didn't include
> bchr(), they just had the class methods, so "bytearray(bchr(...))"
> wasn't an available spelling (if I remember the original API design
> correctly, it would have been something like
> "bytearray(bytes.byte(...))"), which meant there was a strong
> consistency argument in having the alternate constructor on both
> types. Now that the PEP proposes the "bchr" builtin, the "fromord"
> constructors look less necessary.

tl;dr -- Sounds good to me.  I'll update the PEP.

-------

When this started, the idea behind the methods that eventually came to be
called "fromord" and "fromsize" was that they would cover the two possible
interpretations of "bytes(x)":

   the legacy Python 2 behavior:

     >>> var = bytes('abc')
     >>> bytes(var[1])
     'b'

   the current Python 3 behavior:

     >>> var = b'abc'
     >>> bytes(var[1])
     b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
       \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
       \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
       \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
       \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
       \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
       \x00\x00'

Digging deeper, the problem turns out to be that the result of indexing a
bytes object changed:

   Python 2:

     >>> b'abc'[1]
     'b'

   Python 3:

     >>> b'abc'[1]
     98

If we pass an actual byte into the Python 3 bytes constructor it behaves
as one would expect:

     >>> bytes(b'b')
     b'b'

Given all this, it can be argued that the real problem is that retrieving a
single byte from a bytes object gives a different type depending on whether
you use an index (an int) or a slice (a bytes object):

     >>> b'abc'[2]
     99

     >>> b'abc'[2:]
     b'c'
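
The mismatch, and the pitfall it creates when the indexed value is passed
back to the constructor, can be checked in any current Python 3 using only
existing behavior. A small sketch (the variable names are just for
illustration):

```python
data = b'abc'

# Indexing yields an int; slicing yields a bytes object.
assert data[2] == 99
assert data[2:] == b'c'

# A one-element slice is the existing way to get a length-1 bytes object.
i = 1
one_byte = data[i:i + 1]
assert one_byte == b'b'

# Passing the indexed int to the constructor gives that many zero bytes,
# not the byte itself.
zero_filled = bytes(data[i])
assert zero_filled == b'\x00' * 98
```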

Since we cannot fix that behavior, the question is: how do we make it more
livable?

- we can add a built-in to transform the int back into a byte:

   >>> bchr(b'abc'[2])
   b'c'

- we can add a method to return a byte from the bytes object, not an int:

   >>> b'abc'.getbyte(2)
   b'c'

- we can add a method to return a byte from an int:

   >>> bytes.fromint(b'abc'[2])
   b'c'

Which is all to say we have two problems to deal with:

- getting bytes from a bytes object
- getting bytes from an int

Since "bytes.fromint()" and "bchr()" do the same thing, and given that
"bchr(ordinal)" mirrors "chr(ordinal)", I think "bchr" is the better
choice for getting bytes from an int.
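
For illustration only, here is a rough pure-Python stand-in for what "bchr"
would do. This is a sketch, not the PEP's implementation; it assumes the
builtin accepts an int in range(256) and rejects anything else:

```python
def bchr(ordinal):
    """Hypothetical stand-in for the proposed builtin: int -> length-1 bytes."""
    # bytes() given a list of ints builds a bytes object from those values;
    # it raises ValueError if a value is outside range(256).
    return bytes([ordinal])


assert bchr(b'abc'[2]) == b'c'
assert bchr(0x41) == b'A'
# ord() on a length-1 bytes object is the inverse operation.
assert ord(bchr(200)) == 200
```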

For getting bytes from a bytes object, "getbyte()" and "iterbytes()" are
good choices.
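
Since methods cannot be added to the built-in bytes type from Python code,
the following sketch writes the two as free functions. The names follow the
proposal, but the signatures and error behavior here are assumptions:

```python
def getbyte(data, index):
    """Return the byte at `index` as a length-1 bytes object."""
    # Index first so an out-of-range index raises IndexError;
    # a bare slice would silently return b'' instead.
    return bytes([data[index]])


def iterbytes(data):
    """Yield each byte of `data` as a length-1 bytes object."""
    for i in range(len(data)):
        yield data[i:i + 1]


assert getbyte(b'abc', 2) == b'c'
assert list(iterbytes(b'abc')) == [b'a', b'b', b'c']
```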

> Given that, and the uncertain deprecation time frame for accepting
> integers in the main bytes and bytearray constructors, perhaps both
> the "fromsize" and "fromord" parts of the proposal can be deferred
> indefinitely in favour of just adding the bchr() builtin?

Agreed.

--
~Ethan~