On 4 September 2016 at 20:43, Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Sun, Sep 4, 2016 at 12:51 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That said, the PEP does propose "getbyte()" and "iterbytes()" for bytes-oriented indexing and iteration, so there's a reasonable consistency argument in favour of also proposing "byte" as the builtin factory function:
* data.getbyte(idx) would be a more efficient alternative to byte(data[idx])
* data.iterbytes() would be a more efficient alternative to map(byte, data)
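As a rough illustration of the equivalences claimed above, the sketch below emulates the proposed names in pure Python. None of `byte`, `getbyte`, or `iterbytes` exists in current Python; they are written here as hypothetical module-level helper functions rather than as a builtin and methods, and any efficiency gain would only come from a native implementation.

```python
def byte(value):
    """Hypothetical stand-in for the proposed builtin.

    Turns an int in range(256) into a length-1 bytes object.
    bytes([value]) raises ValueError for values outside 0..255.
    """
    return bytes([value])


def getbyte(data, idx):
    """Hypothetical stand-in for the proposed data.getbyte(idx).

    Written as byte(data[idx]) to mirror the equivalence in the proposal.
    """
    return byte(data[idx])


def iterbytes(data):
    """Hypothetical stand-in for the proposed data.iterbytes().

    Written as map(byte, data) to mirror the equivalence in the proposal.
    """
    return map(byte, data)


data = b"abc"
first = getbyte(data, 0)       # a length-1 bytes object, not an int
pieces = list(iterbytes(data)) # a list of length-1 bytes objects
```

The same helpers work on a bytearray, since indexing and iterating a bytearray also yield ints.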
.. I don't understand the argument for having 'byte' in these names. They should have 'char' or 'chr' in them for exactly the same reason that the proposed builtin should have 'chr' in it instead of 'byte'. If 'bytes' is an iterable of ints, then get_byte should probably return an int.
I'm sorry, but this argument comes across as "we're proposing the wrong thing here, so for consistency, we might want to do the wrong thing in this other part too".
There are two self-consistent sets of names:

    bchr
    bytes.getbchr, bytearray.getbchr
    bytes.iterbchr, bytearray.iterbchr

    byte
    bytes.getbyte, bytearray.getbyte
    bytes.iterbytes, bytearray.iterbytes

The former set emphasises the "stringiness" of this behaviour, by aligning with the chr() builtin.

The latter set emphasises that these APIs are still about working with arbitrary binary data rather than text, with a Python "byte" subsequently being a length 1 bytes object containing a single integer between 0 and 255, rather than "What you get when you index or iterate over a bytes instance".

Having noticed the discrepancy, my personal preference is to go with the latter option (since it better fits the "executable pseudocode" ideal and despite my reservations about "bytes objects contain int objects rather than byte objects", that shouldn't be any more confusing in the long run than explaining that str instances are containers of length-1 str instances). The fact "byte" is much easier to pronounce than bchr (bee-cher? bee-char?) also doesn't hurt.

However, I suspect we'll need to put both sets of names in front of Guido and ask him to just pick whichever he prefers to get it resolved one way or the other.
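The asymmetry between bytes and str mentioned above can be checked directly in current Python 3. The snippet below uses only existing behaviour and none of the proposed names; the variable names are purely illustrative.

```python
data = b"abc"
text = "abc"

# Indexing or iterating a bytes object yields ints in range(256).
byte_item = data[0]
byte_items = list(data)

# Slicing a bytes object yields a bytes object, here of length 1.
byte_slice = data[0:1]

# Indexing or iterating a str yields length-1 str objects, which are
# themselves containers of a length-1 str.
text_item = text[0]
text_items = list(text)
nested_item = text_item[0]
```

This is the gap the proposed names are meant to cover: today, getting a length-1 bytes object out of `data` takes a slice or a round trip through `bytes([...])`.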
And didn't someone recently propose deprecating iterability of str (not indexing, or slicing, just iterability)? Then str would also need a way to provide an iterable or sequence view of the characters. For consistency, the str functionality would probably need to mimic the approach in bytes. IOW, this PEP may in fact ultimately dictate how to get an iterable/sequence from a str object.
Strings are not going to become atomic objects, no matter how many times people suggest it.
With bchr, those mappings aren't as clear (plus there's a potentially unwanted "text" connotation arising from the use of the "chr" abbreviation).
Which mappings?
The mapping between the builtin name and the method names.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia