On Tue, Aug 19, 2014 at 5:25 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 18 August 2014 10:45, Guido van Rossum <guido@python.org> wrote:
On Sun, Aug 17, 2014 at 5:22 PM, Barry Warsaw <barry@python.org> wrote:
On Aug 18, 2014, at 10:08 AM, Nick Coghlan wrote:
There's actually another aspect to your idea, independent of the naming: exposing a view rather than just an iterator. I'm going to have to look at the implications for memoryview, but it may be a good way to go (and would align with the iterator -> view changes in dict).
Yep! Maybe that will inspire a better spelling. :)
+1. It's just as much about b[i] as it is about "for c in b", so a view sounds right. (The view would have to be mutable for bytearrays and for writable memoryviews.)
On the rest, it's sounding more and more as if we will just need to live with both bytes(1000) and bytearray(1000). A warning sounds worse than a deprecation to me.
I'm fine with keeping bytearray(1000), since that works the same way in both Python 2 & 3, and doesn't seem likely to be invoked inadvertently.
I'd still like to deprecate "bytes(1000)", since that does different things in Python 2 & 3, while "b'\x00' * 1000" does the same thing in both.
I think any argument based on what "bytes" does in Python 2 is pretty weak, since Python 2's bytes is just an alias for str, so it has tons of behavior that differ -- why single this out? In Python 3, I really like bytes and bytearray to be as similar as possible, and that includes the constructor.
$ python -c 'print("{!r}\n{!r}".format(bytes(10), b"\x00" * 10))'
'10'
'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
$ python3 -c 'print("{!r}\n{!r}".format(bytes(10), b"\x00" * 10))'
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
Hitting the deprecation warning in single-source code would seem to be a strong hint that you have a bug in one version or the other rather than being intended behaviour.
bytes.zeros(n) sounds fine to me; I value similar interfaces for bytes and bytearray pretty highly.
With "bytearray(1000)" sticking around indefinitely, I'm less concerned about adding a "zeros" constructor.
That's fine.
I'm lukewarm on bytes.byte(c); but bytes([c]) does bother me because a size one list is (or at least feels) more expensive to allocate than a size one bytes object. So, okay.
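For reference, a small sketch of the spellings that exist today for turning one integer into a length-1 bytes object (bytes.byte(c) is only the proposal under discussion, so it is not used here; the variable names are just for illustration):

```python
c = 65  # ordinal of "A"

# Wrap the integer in a throwaway one-element container.
from_list = bytes([c])
from_tuple = bytes((c,))

# int.to_bytes avoids the container, at the cost of spelling out a byte order.
from_int = c.to_bytes(1, "big")

# The trap being discussed: bytes(c) means "c zero bytes", not "the byte c".
zero_filled = bytes(c)
```

All three of the first spellings give b'A'; bytes(c) gives 65 NUL bytes.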
So, here's an interesting thing I hadn't previously registered: we actually already have a fairly capable "bytesview" option, and have done since Stefan implemented "memoryview.cast" in 3.3. The trick lies in the 'c' format character for the struct module, which is parsed as a length 1 bytes object rather than as an integer:
>>> data = bytearray(b"Hello world")
>>> bytesview = memoryview(data).cast('c')
>>> list(bytesview)
[b'H', b'e', b'l', b'l', b'o', b' ', b'w', b'o', b'r', b'l', b'd']
>>> b''.join(bytesview)
b'Hello world'
>>> bytesview[0:5] = memoryview(b"olleH").cast('c')
>>> list(bytesview)
[b'o', b'l', b'l', b'e', b'H', b' ', b'w', b'o', b'r', b'l', b'd']
>>> b''.join(bytesview)
b'olleH world'
For the read-only case, it covers everything (iteration, indexing, slicing), for the writable view case, it doesn't cover changing the shape of the target array, and it doesn't cover assigning arbitrary buffer objects (you need to wrap them in a similar cast for memoryview to allow the assignment).
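A minimal sketch of those writable-view limitations, assuming CPython 3.3+ memoryview semantics. The helper try_assign is made up for illustration; it only reports whether an assignment was accepted:

```python
def try_assign(view, index, value):
    """Hypothetical helper: return None on success, else the exception type name."""
    try:
        view[index] = value
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None


data = bytearray(b"Hello world")
view = memoryview(data).cast('c')

# Same-length source that has also been cast to 'c': accepted.
ok = try_assign(view, slice(0, 5), memoryview(b"olleH").cast('c'))

# A plain bytes object exports format 'B', not 'c': rejected.
wrong_format = try_assign(view, slice(0, 5), b"Hello")

# A shorter source would change the shape of the target: rejected.
wrong_shape = try_assign(view, slice(0, 5), memoryview(b"Hi").cast('c'))

# A view over an immutable bytes object is read-only: rejected.
read_only = try_assign(memoryview(b"Hello").cast('c'), 0, b"J")
```

Only the first assignment goes through, so data ends up as bytearray(b'olleH world').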
It's hardly the most *intuitive* spelling though - I was one of the reviewers for Stefan's memoryview rewrite back in 3.3, and I only made the connection today when looking to see how a view object like the one we were discussing elsewhere in the thread might be implemented as a facade over arbitrary memory buffers, rather than being specific to bytes and bytearray.
Maybe the 'future' package can offer an iterbytes or bytesview implemented this way?
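Something along these lines might do as a starting point. This is only a sketch: the name iterbytes is taken from the suggestion above, the implementation is an assumption, and it relies on memoryview.cast, so it needs Python 3.3 or later:

```python
from array import array


def iterbytes(data):
    """Iterate over any contiguous buffer as length-1 bytes objects.

    Hypothetical helper, not an existing API of the 'future' package.
    """
    # Flatten to plain bytes first so non-byte formats (e.g. array('H'))
    # are accepted, then reinterpret each byte with the 'c' format.
    return iter(memoryview(data).cast('B').cast('c'))


chars = list(iterbytes(b"abc"))
from_bytearray = b"".join(iterbytes(bytearray(b"Hello")))
raw_items = list(iterbytes(array('H', [1])))  # two bytes, order is platform dependent
```

Because it goes through memoryview, the same function works for bytes, bytearray, memoryview and other buffer exporters without copying the underlying data up front.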
If we went down the "bytesview" path, then a single new facade would cover not only the 3 builtins (bytes, bytearray, memoryview) but also any *other* buffer exporting type. If we so chose (at some point in the future, not as part of this PEP), such a type could allow additional bytes operations (like "count", "startswith" or "index") to be applied to arbitrary regions of memory without making a copy.
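To make the idea concrete, here is a rough sketch of what such a facade could look like. The class name BytesView and its methods are purely hypothetical, it is read-only, and count is limited to single byte values to keep it short:

```python
class BytesView:
    """Hypothetical read-only facade over any contiguous buffer exporter."""

    def __init__(self, obj):
        # Flatten to unsigned bytes so every operation is byte-wise.
        self._view = memoryview(obj).cast('B')

    def __len__(self):
        return len(self._view)

    def startswith(self, prefix):
        # Slicing a memoryview yields another view, and comparing two
        # views walks the underlying buffers rather than building bytes.
        p = memoryview(prefix).cast('B')
        return self._view[:len(p)] == p

    def count(self, byte_value):
        # Count occurrences of one integer byte value by iterating in place.
        return sum(1 for b in self._view if b == byte_value)


bv = BytesView(bytearray(b"Hello world"))
starts = bv.startswith(b"Hello")
not_starts = bv.startswith(b"world")
o_count = bv.count(ord("o"))
```

The same class accepts bytes, bytearray, memoryview or an array.array, since all it needs is the buffer protocol.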
Why call out "without making a copy" for operations that naturally don't have to copy anything?
We can't add those other operations to memoryview, since they don't make sense for an n-dimensional array.
I'm sorry for your efforts, but I'm getting more and more lukewarm about the entire PEP.
--
--Guido van Rossum (python.org/~guido)