Re: [Python-Dev] PEP 467: last round (?)

Sept. 4, 2016

On 4 September 2016 at 20:43, Koos Zevenhoven <k7hoven@gmail.com> wrote:
...
On Sun, Sep 4, 2016 at 12:51 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
...
That said, the PEP does propose "getbyte()" and "iterbytes()" for
bytes-oriented indexing and iteration, so there's a reasonable
consistency argument in favour of also proposing "byte" as the builtin
factory function:
* data.getbyte(idx) would be a more efficient alternative to byte(data[idx])
* data.iterbytes() would be a more efficient alternative to map(byte, data)
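
The two equivalences listed above can be sketched in pure Python. None of
these names exist in current Python: byte, getbyte and iterbytes below are
hypothetical module-level stand-ins for the proposed builtin and methods, and
they only model the intended results, not the efficiency gain.

```python
def byte(value):
    """Hypothetical stand-in for the proposed builtin.

    Converts an int in range(256) to a length-1 bytes object.
    bytes([value]) raises ValueError for ints outside that range.
    """
    return bytes([value])


def getbyte(data, idx):
    """Hypothetical stand-in for data.getbyte(idx).

    Returns the item at idx as a length-1 bytes object instead of an int.
    Raises IndexError for an out-of-range idx, like ordinary indexing.
    """
    return byte(data[idx])


def iterbytes(data):
    """Hypothetical stand-in for data.iterbytes().

    Yields each item of data as a length-1 bytes object.
    """
    for value in data:
        yield byte(value)
```

Under these assumptions, getbyte(data, idx) gives the same result as
byte(data[idx]), and iterbytes(data) yields the same items as map(byte, data).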
.. I don't understand the argument for having 'byte' in these names.
They should have 'char' or 'chr' in them for exactly the same reason
that the proposed builtin should have 'chr' in it instead of 'byte'.
If 'bytes' is an iterable of ints, then get_byte should probably
return an int.
I'm sorry, but this argument comes across as "we're proposing the
wrong thing here, so for consistency, we might want to do the wrong
thing in this other part too".
There are two self-consistent sets of names:

    bchr
    bytes.getbchr, bytearray.getbchr
    bytes.iterbchr, bytearray.iterbchr

    byte
    bytes.getbyte, bytearray.getbyte
    bytes.iterbytes, bytearray.iterbytes

The former set emphasises the "stringiness" of this behaviour, by
aligning with the chr() builtin.

The latter set emphasises that these APIs are still about working with
arbitrary binary data rather than text, with a Python "byte"
subsequently being a length 1 bytes object containing a single integer
between 0 and 255, rather than "What you get when you index or iterate
over a bytes instance".

Having noticed the discrepancy, my personal preference is to go with
the latter option, since it better fits the "executable pseudocode"
ideal. Despite my reservations about "bytes objects contain int
objects rather than byte objects", that shouldn't be any more
confusing in the long run than explaining that str instances are
containers of length-1 str instances. The fact that "byte" is much
easier to pronounce than bchr (bee-cher? bee-char?) also doesn't hurt.
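
For reference, the existing behaviour being compared in this paragraph can be
checked directly in current Python 3. The snippet below uses only today's
indexing, slicing and iteration semantics and none of the proposed names; the
variable names are purely illustrative.

```python
data = b"abc"
text = "abc"

# Indexing or iterating over a bytes object yields ints in range(256).
byte_item = data[0]
byte_items = list(data)

# Slicing yields a length-1 bytes object, i.e. a "byte" in the sense
# used by the second set of names.
byte_slice = data[0:1]

# Indexing or iterating over a str yields length-1 str instances.
str_item = text[0]
str_items = list(text)
```

So a str contains objects of its own type, while a bytes object contains ints,
which is the asymmetry the "byte" naming has to be explained around.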

However, I suspect we'll need to put both sets of names in front of
Guido and ask him to just pick whichever he prefers to get it resolved
one way or the other.
...
And didn't someone recently propose deprecating iterability of str
(not indexing, or slicing, just iterability)? Then str would also need
a way to provide an iterable or sequence view of the characters. For
consistency, the str functionality would probably need to mimic the
approach in bytes. IOW, this PEP may in fact ultimately dictate how to
get an iterable/sequence from a str object.
Strings are not going to become atomic objects, no matter how many
times people suggest it.
...
...
With bchr, those mappings aren't as clear (plus there's a potentially
unwanted "text" connotation arising from the use of the "chr"
abbreviation).
Which mappings?
The mapping between the builtin name and the method names.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia