Bitwise operations on bytes class

I find myself, fairly often, needing to perform bitwise operations (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes and bytearray). I can't think of any other reasonable use for these operators. Is upstream Python interested in this kind of behavior by default? At the least, it would make many algorithms very easy to read and write. Nathaniel

On 06/16/2014 11:03 AM, Nathaniel McCallum wrote:
I find myself, fairly often, needing to perform bitwise operations (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes and bytearray). I can't think of any other reasonable use for these operators. Is upstream Python interested in this kind of behavior by default? At the least, it would make many algorithms very easy to read and write.
Could you give a couple examples? -- ~Ethan~

On 6/16/2014 2:03 PM, Nathaniel McCallum wrote:
I find myself, fairly often, needing to perform bitwise operations (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes and bytearray).
If you are often doing and/or/xor on large arrays, as one might do for bitmap images, you should probably be using numpy or a derivative thereof. What use do you have for shifting bits across byte boundaries, where the bytes are really bytes? Why would you not turn multiple bytes considered together into an int?
I can't think of any other reasonable use for these operators.
I don't understand this. They are routinely used on ints for various purposes. -- Terry Jan Reedy

On Mon, 2014-06-16 at 15:20 -0400, Terry Reedy wrote:
On 6/16/2014 2:03 PM, Nathaniel McCallum wrote:
I find myself, fairly often, needing to perform bitwise operations (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes and bytearray).
If you are often doing and/or/xor on large arrays, as one might do for bitmap images, you should probably be using numpy or a derivative thereof.
What use do you have for shifting bits across byte boundaries, where the bytes are really bytes? Why would you not turn multiple bytes considered together into an int?
There are many reasons. Anything relating to cryptography, key derivation, asn1 BitString, etc. Many network protocols have specialized algorithms which require bit rotations or bitwise operations on blocks.
I can't think of any other reasonable use for these operators.
I don't understand this. They are routinely used on ints for various purposes.
I meant that, for instance, I can't think of any other reasonable interpretation for what "bytes() ^ bytes()" would mean other than a bitwise xor of the bytes in the arrays. Yes, of course the operators have meanings in other contexts. But in this context, I think the meaning of the operators is self-evident and precise in meaning. Perhaps some code will clarify what I'm proposing. Attached is a class I have found continual reuse for over the last few years. It implements bitwise operators on a bytes subclass. Something similar could be done for bytearray. Nathaniel
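To make the proposal concrete, here is a minimal sketch of what such a bytes subclass might look like. It is not the class attached to the message above; the name BitBytes, the helper _combine, and the choice to reject operands of different lengths are all assumptions made for illustration.

```python
import operator


class BitBytes(bytes):
    """Hypothetical bytes subclass with element-wise &, | and ^."""

    def _combine(self, other, op):
        # Only combine with other byte strings; let Python report the rest.
        if not isinstance(other, (bytes, bytearray)):
            return NotImplemented
        # Assumption: mismatched lengths are an error rather than padded.
        if len(self) != len(other):
            raise ValueError("operands must have the same length")
        return type(self)(op(a, b) for a, b in zip(self, other))

    def __and__(self, other):
        return self._combine(other, operator.and_)

    def __or__(self, other):
        return self._combine(other, operator.or_)

    def __xor__(self, other):
        return self._combine(other, operator.xor)


# The chaining idiom from the discussion, using the hypothetical class.
blocks = [BitBytes(b"\x0f\xf0")]
block = BitBytes(b"\xff\xff")
blocks.append(blocks[-1] ^ block)
```

A pure-Python class like this copies on every operation, which is the cost a native implementation would be expected to avoid.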

Nathaniel McCallum, 16.06.2014 21:43:
Perhaps some code will clarify what I'm proposing. Attached is a class I have found continual reuse for over the last few years. It implements bitwise operators on a bytes subclass. Something similar could be done for bytearray.
Ok, according to your code, you don't want a SIMD type but rather an arbitrary size integer type. Why don't you just use the "int" ("long" in Py2) type for that? It has way faster operations than your multiple copy implementation. Stefan
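For comparison, the int-based route suggested here can be sketched with int.from_bytes() and int.to_bytes(), both available in Python 3. The function name xor_via_int and the big-endian byte order are assumptions for illustration; note that the length has to be carried separately, because the int itself does not remember it.

```python
def xor_via_int(first: bytes, second: bytes) -> bytes:
    """XOR two equal-length byte strings by round-tripping through int."""
    if len(first) != len(second):
        raise ValueError("operands must have the same length")
    size = len(first)  # the int loses this, so keep it around
    result = int.from_bytes(first, "big") ^ int.from_bytes(second, "big")
    # Passing the explicit size restores any leading zero bytes.
    return result.to_bytes(size, "big")


example = xor_via_int(b"\x00\xff", b"\x00\x0f")
```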

On Mon, 2014-06-16 at 21:55 +0200, Stefan Behnel wrote:
Nathaniel McCallum, 16.06.2014 21:43:
Perhaps some code will clarify what I'm proposing. Attached is a class I have found continual reuse for over the last few years. It implements bitwise operators on a bytes subclass. Something similar could be done for bytearray.
Ok, according to your code, you don't want a SIMD type but rather an arbitrary size integer type. Why don't you just use the "int" ("long" in Py2) type for that? It has way faster operations than your multiple copy implementation.
Of course my attached code is slow. This is precisely why I'm proposing native additions to the bytes class.
However, in most algorithms, there is a single operation like this on a block of data which is otherwise not treated as an integer. This operation often takes the form of something like:
blocks.append(blocks[-1] ^ block)
In all the surrounding code, you are dealing with bytes *as* bytes. Converting into alternate types breaks up the readability of the algorithm. And given the security requirements of such algorithms, readability is extremely important.
The above code example has both simplicity and obviousness. Currently, in py3k, this is AFAICS the best alternative for readability:
blocks.append([a ^ b for a, b in zip(blocks[-1], block)])
While this is infinitely better than Python 2.x, I think my proposal is still significantly more readable. When implemented natively, my proposal is also far more performant than this. Nathaniel

On Mon, 2014-06-16 at 16:16 -0400, Nathaniel McCallum wrote:
On Mon, 2014-06-16 at 21:55 +0200, Stefan Behnel wrote:
Nathaniel McCallum, 16.06.2014 21:43:
Perhaps some code will clarify what I'm proposing. Attached is a class I have found continual reuse for over the last few years. It implements bitwise operators on a bytes subclass. Something similar could be done for bytearray.
Ok, according to your code, you don't want a SIMD type but rather an arbitrary size integer type. Why don't you just use the "int" ("long" in Py2) type for that? It has way faster operations than your multiple copy implementation.
Of course my attached code is slow. This is precisely why I'm proposing native additions to the bytes class.
However, in most algorithms, there is a single operation like this on a block of data which is otherwise not treated as an integer. This operation often takes the form of something like:
blocks.append(blocks[-1] ^ block)
In all the surrounding code, you are dealing with bytes *as* bytes. Converting into alternate types breaks up the readability of the algorithm. And given the security requirements of such algorithms, readability is extremely important.
The above code example has both simplicity and obviousness. Currently, in py3k, this is AFAICS the best alternative for readability:
blocks.append([a ^ b for a, b in zip(blocks[-1], block)])
While this is infinitely better than Python 2.x, I think my proposal is still significantly more readable. When implemented natively, my proposal is also far more performant than this.
Also, when implemented on bytearray, you can get things like this: cksum ^= block. This can be very fast as it can be done with no copies. It is also extremely readable. Nathaniel
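The in-place case can be approximated today with a small helper. This is only a sketch of the semantics that "cksum ^= block" would have under the proposal; the function name ixor and the same-length check are assumptions, and a native version would not need the Python-level loop.

```python
def ixor(target: bytearray, block) -> None:
    """XOR block into target in place, without allocating a new buffer."""
    if len(target) != len(block):
        raise ValueError("operands must have the same length")
    for index, value in enumerate(block):
        target[index] ^= value


cksum = bytearray(4)
original_id = id(cksum)
for block in (b"\x01\x02\x03\x04", b"\x10\x20\x30\x40"):
    ixor(cksum, block)
```

After the loop, cksum is still the same bytearray object; only its contents have changed.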

Nathaniel McCallum wrote:
In all the surrounding code, you are dealing with bytes *as* bytes. Converting into alternate types breaks up the readability of the algorithm. And given the security requirements of such algorithms, readability is extremely important.
Not to mention needlessly inefficient. There's also the issue that you are usually dealing with a specific number of bits. When you convert to an int, you lose any notion of it having a size, so you have to keep track of that separately, and take its effect on the bitwise operations into account manually. E.g. the bitwise complement of an N-bit string is another N-bit string. But the bitwise complement of a positive int is a bit string with an infinite number of leading 1 bits, which you have to mask off. The bitwise complement of a bytes object, on the other hand, would be another bytes object of the same size. -- Greg
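The masking step described here looks roughly like this in practice. The variable names are illustrative, and big-endian order is an assumption; the point is only that the int route needs the width supplied by hand, while a byte-wise complement keeps it implicitly.

```python
data = b"\x0f\x00"
n_bits = 8 * len(data)

# Route 1: through int. The complement of a non-negative int is negative,
# so it must be masked back to the intended width before converting.
as_int = int.from_bytes(data, "big")
raw_complement = ~as_int
masked = raw_complement & ((1 << n_bits) - 1)
complement_via_int = masked.to_bytes(len(data), "big")

# Route 2: byte by byte. The result has the same length automatically.
complement_bytewise = bytes(b ^ 0xFF for b in data)
```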

On Tue, 2014-06-17 at 09:53 +1200, Greg Ewing wrote:
Nathaniel McCallum wrote:
In all the surrounding code, you are dealing with bytes *as* bytes. Converting into alternate types breaks up the readability of the algorithm. And given the security requirements of such algorithms, readability is extremely important.
Not to mention needlessly inefficient.
There's also the issue that you are usually dealing with a specific number of bits. When you convert to an int, you lose any notion of it having a size, so you have to keep track of that separately, and take its effect on the bitwise operations into account manually.
E.g. the bitwise complement of an N-bit string is another N-bit string. But the bitwise complement of a positive int is a bit string with an infinite number of leading 1 bits, which you have to mask off. The bitwise complement of a bytes object, on the other hand, would be another bytes object of the same size.
+1

On Tue, Jun 17, 2014 at 6:16 AM, Nathaniel McCallum <npmccallum@redhat.com> wrote:
Of course my attached code is slow. This is precisely why I'm proposing native additions to the bytes class.
I presume you're aware that the bytes type is immutable, right? You're still going to have at least some copying going on, whereas with a mutable type you might well be able to avoid that. Efficiency suggests bytearray instead. ChrisA

On Tue, Jun 17, 2014 at 10:00 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Chris Angelico wrote:
I presume you're aware that the bytes type is immutable, right? You're still going to have at least some copying going on, whereas with a mutable type you might well be able to avoid that. Efficiency suggests bytearray instead.
Why not both?
If you do a series of operations on a large bytes object, each one will involve a full copy. If you do the same series of operations on a large mutable object, they can be optimized down to non-copying. Why both? ChrisA

On 17 Jun 2014 10:04, "Chris Angelico" <rosuav@gmail.com> wrote:
On Tue, Jun 17, 2014 at 10:00 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Chris Angelico wrote:
I presume you're aware that the bytes type is immutable, right? You're still going to have at least some copying going on, whereas with a mutable type you might well be able to avoid that. Efficiency suggests bytearray instead.
Why not both?
If you do a series of operations on a large bytes object, each one will involve a full copy. If you do the same series of operations on a large mutable object, they can be optimized down to non-copying. Why both?
Because the two APIs are currently in sync outside mutating operations, and there isn't a compelling reason to break that symmetry, even if this proposal was put forward as a PEP and ultimately accepted. Cheers, Nick.
ChrisA _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/

On Tue, Jun 17, 2014 at 4:02 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Because the two APIs are currently in sync outside mutating operations, and there isn't a compelling reason to break that symmetry, even if this proposal was put forward as a PEP and ultimately accepted.
Ah! That would be why. Sorry for the noise! ChrisA

On 17 June 2014 16:03, Chris Angelico <rosuav@gmail.com> wrote:
On Tue, Jun 17, 2014 at 4:02 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Because the two APIs are currently in sync outside mutating operations, and there isn't a compelling reason to break that symmetry, even if this proposal was put forward as a PEP and ultimately accepted.
Ah! That would be why. Sorry for the noise!
Clarifying non-obvious design principles isn't noise on python-ideas, it's one of the reasons the list exists :) Cheers, Nick.

On Tue, Jun 17, 2014 at 6:36 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 17 June 2014 16:03, Chris Angelico <rosuav@gmail.com> wrote:
On Tue, Jun 17, 2014 at 4:02 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Because the two APIs are currently in sync outside mutating operations, and there isn't a compelling reason to break that symmetry, even if this proposal was put forward as a PEP and ultimately accepted.
Ah! That would be why. Sorry for the noise!
Clarifying non-obvious design principles isn't noise on python-ideas, it's one of the reasons the list exists :)
Then I'm glad to have been able to play the role of The Watson [1] for the benefit of the audience :) ChrisA [1] http://tvtropes.org/pmwiki/pmwiki.php/Main/TheWatson

On Tue, Jun 17, 2014 at 08:59:30AM +1000, Chris Angelico wrote:
On Tue, Jun 17, 2014 at 6:16 AM, Nathaniel McCallum <npmccallum@redhat.com> wrote:
Of course my attached code is slow. This is precisely why I'm proposing native additions to the bytes class.
I presume you're aware that the bytes type is immutable, right? You're still going to have at least some copying going on, whereas with a mutable type you might well be able to avoid that. Efficiency suggests bytearray instead.
The very first sentence of Nathaniel's first post in this thread: "I find myself, fairly often, needing to perform bitwise operations (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes and bytearray)." So yes, I think he is aware of it :-) -- Steven

Interesting idea. I like it. I notice Python 3 has int.from_bytes() and int.to_bytes(). On Mon, Jun 16, 2014 at 3:43 PM, Nathaniel McCallum <npmccallum@redhat.com> wrote:
On Mon, 2014-06-16 at 15:20 -0400, Terry Reedy wrote:
On 6/16/2014 2:03 PM, Nathaniel McCallum wrote:
I find myself, fairly often, needing to perform bitwise operations (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes and bytearray).
If you are often doing and/or/xor on large arrays, as one might do for bitmap images, you should probably be using numpy or a derivative thereof.
What use do you have for shifting bits across byte boundaries, where the bytes are really bytes? Why would you not turn multiple bytes considered together into an int?
There are many reasons. Anything relating to cryptography, key derivation, asn1 BitString, etc. Many network protocols have specialized algorithms which require bit rotations or bitwise operations on blocks.
I can't think of any other reasonable use for these operators.
I don't understand this. They are routinely used on ints for various purposes.
I meant that, for instance, I can't think of any other reasonable interpretation for what "bytes() ^ bytes()" would mean other than a bitwise xor of the bytes in the arrays. Yes, of course the operators have meanings in other contexts. But in this context, I think the meaning of the operators is self-evident and precise in meaning.
Perhaps some code will clarify what I'm proposing. Attached is a class I have found continual reuse for over the last few years. It implements bitwise operators on a bytes subclass. Something similar could be done for bytearray.
Nathaniel

As additional input to this discussion I would like to remind you all that it's not a good idea to have every operator apply to every data type, as this increases the chances that bugs percolate up to a point where it's hard to figure out where an unexpected value was generated. IOW, just because there's no current meaning for e.g. b^b, that doesn't necessarily make it a good idea to add one. (There are other arguments from language usability against adding new operations indiscriminately, but this in particular jumped out at me.) -- --Guido van Rossum (python.org/~guido)

On Mon, 2014-06-16 at 13:21 -0700, Guido van Rossum wrote:
As additional input to this discussion I would like to remind you all that it's not a good idea to have every operator apply to every data type, as this increases the chances that bugs percolate up to a point where it's hard to figure out where an unexpected value was generated. IOW, just because there's no current meaning for e.g. b^b, that doesn't necessarily make it a good idea to add one. (There are other arguments from language usability against adding new operations indiscriminately, but this in particular jumped out at me.)
Agreed. My only thought here was that this addition seems to me to be extremely natural and emulates the precise grammar that is very often seen in algorithms in IETF RFCs (for instance). But the precise threshold of "too many operators" can be difficult to gauge. That is probably above my pay grade. :) Nathaniel

There's a bitstring package on PyPI, perhaps it has the desired operations: https://pypi.python.org/pypi/bitstring/ Regards Antoine. On 16/06/2014 16:28, Nathaniel McCallum wrote:
On Mon, 2014-06-16 at 13:21 -0700, Guido van Rossum wrote:
As additional input to this discussion I would like to remind you all that it's not a good idea to have every operator apply to every data type, as this increases the chances that bugs percolate up to a point where it's hard to figure out where an unexpected value was generated. IOW, just because there's no current meaning for e.g. b^b, that doesn't necessarily make it a good idea to add one. (There are other arguments from language usability against adding new operations indiscriminately, but this in particular jumped out at me.)
Agreed. My only thought here was that this addition seems to me to be extremely natural and emulates the precise grammar that is very often seen in algorithms in IETF RFCs (for instance). But the precise threshold of "too many operators" can be difficult to gauge. That is probably above my pay grade. :)
Nathaniel

On 16.06.14 23:38, Antoine Pitrou wrote:
There's a bitstring package on PyPI, perhaps it has the desired operations: https://pypi.python.org/pypi/bitstring/
And bitarray: https://pypi.python.org/pypi/bitarray

On 17 Jun 2014 05:44, "Nathaniel McCallum" <npmccallum@redhat.com> wrote:
On Mon, 2014-06-16 at 15:20 -0400, Terry Reedy wrote:
On 6/16/2014 2:03 PM, Nathaniel McCallum wrote:
I find myself, fairly often, needing to perform bitwise operations (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes and bytearray).
If you are often doing and/or/xor on large arrays, as one might do for bitmap images, you should probably be using numpy or a derivative thereof.
What use do you have for shifting bits across byte boundaries, where the bytes are really bytes? Why would you not turn multiple bytes considered together into an int?
There are many reasons. Anything relating to cryptography, key derivation, asn1 BitString, etc. Many network protocols have specialized algorithms which require bit rotations or bitwise operations on blocks.
I used to want something like this when trying to deal with bit slips on serial channels - sliding a pattern one bit to the left or right was a pain. It makes more sense on the bytes type to me than it does on multibyte array formats (which would suffer from messy endianness issues). As Nathaniel noted, there's no other obvious meaning for these operations on the binary data types, and it would definitely make bitbashing in Python easier (something that will only become more common with the rise of things like Arduino, Raspberry Pi and MicroPython). Cheers, Nick.
I can't think of any other reasonable use for these operators.
I don't understand this. They are routinely used on ints for various purposes.
I meant that, for instance, I can't think of any other reasonable interpretation for what "bytes() ^ bytes()" would mean other than a bitwise xor of the bytes in the arrays. Yes, of course the operators have meanings in other contexts. But in this context, I think the meaning of the operators is self-evident and precise in meaning.
Perhaps some code will clarify what I'm proposing. Attached is a class I have found continual reuse for over the last few years. It implements bitwise operators on a bytes subclass. Something similar could be done for bytearray.
Nathaniel

Nathaniel McCallum, 16.06.2014 20:03:
I find myself, fairly often, needing to perform bitwise operations (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes and bytearray). I can't think of any other reasonable use for these operators. Is upstream Python interested in this kind of behavior by default? At the least, it would make many algorithms very easy to read and write.
ISTM that what you're asking for is essentially a SIMD data type, which certainly has a lot of nice applications. However, restricting it to byte values seems to be a rather niche use case to me. IMHO, this seems much better suited for the array module than the "bytes as in string" general purpose bytes type. The array module has support for all sorts of C-ish integer types. Different ways to handle errors (e.g. overflows) across the array would be another reason to not push this into the bytes type. Stefan

On 06/16/2014 11:03 AM, Nathaniel McCallum wrote:
I find myself, fairly often, needing to perform bitwise operations (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes and bytearray). I can't think of any other reasonable use for these operators. Is upstream Python interested in this kind of behavior by default? At the least, it would make many algorithms very easy to read and write.
I like the idea, but one question I have: when shifting, are the incoming bits set to 0 or 1? Why? -- ~Ethan~

On 17/06/2014 15:35, Ethan Furman wrote:
I like the idea, but one question I have: when shifting, are the incoming bits set to 0 or 1? Why?
By convention, 0. Historically, that's how CPUs do it. (and also because it provides a quick way of multiplying / dividing by 2^N). Regards Antoine.
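A zero-filling shift across byte boundaries can be sketched as follows. The function names lshift and rshift, the big-endian interpretation of the byte string, and the fixed output length are assumptions for illustration, not part of the proposal as stated.

```python
def lshift(data: bytes, count: int) -> bytes:
    """Shift left by count bits; zeros enter on the right, length is kept."""
    n_bits = 8 * len(data)
    value = int.from_bytes(data, "big")
    mask = (1 << n_bits) - 1  # drop bits pushed past the left edge
    return ((value << count) & mask).to_bytes(len(data), "big")


def rshift(data: bytes, count: int) -> bytes:
    """Shift right by count bits; zeros enter on the left, length is kept."""
    value = int.from_bytes(data, "big")  # non-negative, so >> fills with 0
    return (value >> count).to_bytes(len(data), "big")


shifted_left = lshift(b"\x80\x01", 1)
shifted_right = rshift(b"\x80\x01", 1)
```

In the example, the top bit of the first byte is discarded by the left shift, and the low bit of the second byte is discarded by the right shift.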

On 2014-06-17 21:37, Antoine Pitrou wrote:
On 17/06/2014 15:35, Ethan Furman wrote:
I like the idea, but one question I have: when shifting, are the incoming bits set to 0 or 1? Why?
By convention, 0. Historically, that's how CPUs do it. (and also because it provides a quick way of multiplying / dividing by 2^N).
That's sometimes known as a "logical shift". When shifting to the right, there's also the "arithmetic shift", which preserves the most significant bit. Do we need that too? (I don't think so.) If yes, then what should the operator be? Just a 'normal' method call?

On 18 Jun 2014 07:34, "MRAB" <python@mrabarnett.plus.com> wrote:
On 2014-06-17 21:37, Antoine Pitrou wrote:
On 17/06/2014 15:35, Ethan Furman wrote:
I like the idea, but one question I have: when shifting, are the incoming bits set to 0 or 1? Why?
By convention, 0. Historically, that's how CPUs do it. (and also because it provides a quick way of multiplying / dividing by 2^N).
That's sometimes known as a "logical shift".
My bitbashing-with-Python work was all serial communications protocol based, so logical shifts were what I wanted (I was also in the fortunate position of being able to tolerate the slow speed of doing them in Python, because HF radio comms are so slow the data streams to be analysed weren't very big).
When shifting to the right, there's also the "arithmetic shift", which preserves the most significant bit.
Do we need that too? (I don't think so.) If yes, then what should the operator be? Just a 'normal' method call?
Wanting an arithmetic shift would be a sign that one is working with integers rather than arbitrary binary data, and ints or one of the fixed width types from NumPy would likely be a better fit. So leaving that out of any proposal sounds fine to me. Cheers, Nick.
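The distinction is easy to see with Python's own int type, whose right shift on a negative value is arithmetic. The 8-bit width below is an arbitrary choice for illustration.

```python
value = -8

# Python ints behave as sign-extended integers: >> keeps the sign.
arithmetic = value >> 1

# Treating the same value as an 8-bit pattern and shifting logically
# brings a zero in at the top instead.
pattern = value & 0xFF
logical = pattern >> 1
```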

On 2014-06-17 23:10, Nick Coghlan wrote:
On 18 Jun 2014 07:34, "MRAB" <python@mrabarnett.plus.com> wrote:
On 2014-06-17 21:37, Antoine Pitrou wrote:
On 17/06/2014 15:35, Ethan Furman wrote:
I like the idea, but one question I have: when shifting, are the incoming bits set to 0 or 1? Why?
By convention, 0. Historically, that's how CPUs do it. (and also because it provides a quick way of multiplying / dividing by 2^N).
That's sometimes known as a "logical shift".
My bitbashing-with-Python work was all serial communications protocol based, so logical shifts were what I wanted (I was also in the fortunate position of being able to tolerate the slow speed of doing them in Python, because HF radio comms are so slow the data streams to be analysed weren't very big).
When shifting to the right, there's also the "arithmetic shift", which preserves the most significant bit.
Do we need that too? (I don't think so.) If yes, then what should the operator be? Just a 'normal' method call?
Wanting an arithmetic shift would be a sign that one is working with integers rather than arbitrary binary data, and ints or one of the fixed width types from NumPy would likely be a better fit. So leaving that out of any proposal sounds fine to me.
What about rotates?

On 18 Jun 2014 09:31, "MRAB" <python@mrabarnett.plus.com> wrote:
On 2014-06-17 23:10, Nick Coghlan wrote:
Wanting an arithmetic shift would be a sign that one is working with integers rather than arbitrary binary data, and ints or one of the fixed width types from NumPy would likely be a better fit. So leaving that out of any proposal sounds fine to me.
What about rotates?
Bitwise rotation would be a bit of a pain to build on top of bitwise masking and logical shifts, but it could be done, so I think it would make more sense to keep a proposal minimal. Cheers, Nick.
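As a rough indication of what building a rotate from masking and logical shifts involves, here is one possible sketch. The function name rotate_left and the big-endian interpretation are assumptions; it goes through int rather than through any proposed bytes operators, since those do not exist.

```python
def rotate_left(data: bytes, count: int) -> bytes:
    """Rotate the whole bit string left by count bits."""
    n_bits = 8 * len(data)
    if n_bits == 0:
        return data
    count %= n_bits  # rotating by the full width is a no-op
    value = int.from_bytes(data, "big")
    mask = (1 << n_bits) - 1
    # Bits shifted out on the left are brought back in on the right.
    rotated = ((value << count) | (value >> (n_bits - count))) & mask
    return rotated.to_bytes(len(data), "big")


rotated_once = rotate_left(b"\x80\x01", 1)
```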

On Wed, 2014-06-18 at 12:34 +1000, Nick Coghlan wrote:
On 18 Jun 2014 09:31, "MRAB" <python@mrabarnett.plus.com> wrote:
On 2014-06-17 23:10, Nick Coghlan wrote:
Wanting an arithmetic shift would be a sign that one is working with integers rather than arbitrary binary data, and ints or one of the fixed width types from NumPy would likely be a better fit. So leaving that out of any proposal sounds fine to me.
What about rotates?
Bitwise rotation would be a bit of a pain to build on top of bitwise masking and logical shifts, but it could be done, so I think it would make more sense to keep a proposal minimal.
Agreed. The code that I attached to one of my early replies actually implemented rotate, but I don't think that is what should be implemented by default in this proposal. Nathaniel

On Mon, 2014-06-16 at 14:03 -0400, Nathaniel McCallum wrote:
I find myself, fairly often, needing to perform bitwise operations (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes and bytearray). I can't think of any other reasonable use for these operators. Is upstream Python interested in this kind of behavior by default? At the least, it would make many algorithms very easy to read and write.
So it seems to me that there is a consensus that something like this is a good idea, with perhaps the exception of Guido's reminder to not overpopulate the operators (is that a no for this proposal?).
Summarizing:
1. In lshift, what bits are introduced on the right-hand side? Zero is traditional.
2. In rshift, what bits are introduced on the left-hand side? An argument can be made for either zero (logical) or retaining the left-most bit (arithmetic). The 'arithmetic shift' seems to fit the sphere of NumPy. Zero should be preferred.
3. Rotates and other common operations are out of scope for this proposal.
4. One question not discussed is what to do when attempting to and/or/xor against a bytes() or bytearray() that is of a different length. Should we left-align the shorter of the two? Right-align? Throw an exception?
Also, I'm new to this process. Where should I go from here? Do I need to form a PEP? Nathaniel
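The three options in point 4 give different answers for the same inputs, which the following sketch tries to make visible. The function names and the use of zero padding are assumptions for illustration; the thread does not settle which behaviour, if any, would be chosen.

```python
def xor_strict(a: bytes, b: bytes) -> bytes:
    """Refuse operands of different lengths."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def xor_left_aligned(a: bytes, b: bytes) -> bytes:
    """Pad the shorter operand with zero bytes on the right."""
    size = max(len(a), len(b))
    return xor_strict(a.ljust(size, b"\x00"), b.ljust(size, b"\x00"))


def xor_right_aligned(a: bytes, b: bytes) -> bytes:
    """Pad the shorter operand with zero bytes on the left."""
    size = max(len(a), len(b))
    return xor_strict(a.rjust(size, b"\x00"), b.rjust(size, b"\x00"))


long_operand = b"\xff\xff\xff"
short_operand = b"\x0f"
left = xor_left_aligned(long_operand, short_operand)
right = xor_right_aligned(long_operand, short_operand)
# A bare zip() silently truncates to the shorter operand instead.
truncated = bytes(x ^ y for x, y in zip(long_operand, short_operand))
```

The silent truncation of the zip-based idiom is one argument for raising an exception, since it hides length mismatches.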

On 18/06/2014 11:35, Nathaniel McCallum wrote:
On Mon, 2014-06-16 at 14:03 -0400, Nathaniel McCallum wrote:
I find myself, fairly often, needing to perform bitwise operations (rshift, lshift, and, or, xor) on arrays of bytes in python (both bytes and bytearray). I can't think of any other reasonable use for these operators. Is upstream Python interested in this kind of behavior by default? At the least, it would make many algorithms very easy to read and write.
So it seems to me that there is a consensus that something like this is a good idea, with perhaps the exception of Guido's reminder to not overpopulate the operators (is that a no for this proposal?).
Rather than adding new operations to bytes/bytearray, an alternative is a separate type ("bitview"?) which would take a writable buffer as argument and then provide the operations over that buffer. It would also make the operations compatible with other writable buffer types such as numpy arrays, etc. Regards Antoine.
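One way to picture such a type is a thin wrapper around memoryview. The class below is purely hypothetical: the name BitView, its constructor, and the decision to support only in-place XOR are assumptions, and no such type exists in the standard library.

```python
class BitView:
    """Hypothetical in-place bitwise view over any writable buffer."""

    def __init__(self, buffer):
        view = memoryview(buffer)
        if view.readonly:
            raise TypeError("BitView needs a writable buffer")
        # View the underlying memory as unsigned bytes.
        self._view = view.cast("B")

    def __ixor__(self, other):
        other_view = memoryview(other).cast("B")
        if len(other_view) != len(self._view):
            raise ValueError("operands must have the same length")
        for index, value in enumerate(other_view):
            self._view[index] ^= value  # writes through to the buffer
        return self


storage = bytearray(b"\x0f\xf0")
view = BitView(storage)
view ^= b"\xff\xff"
```

Because the view writes through to the buffer it wraps, storage itself is modified and no copy of the data is made.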

On Wed, Jun 18, 2014 at 11:51 AM, Antoine Pitrou <antoine@python.org> wrote:
Rather than adding new operations to bytes/bytearray, an alternative is a separate type ("bitview"?) which would take a writable buffer as argument and then provide the operations over that buffer.
+1 .. and it does not have to be part of stdlib. The advantage of implementing this outside of stdlib is that users of older versions of Python will benefit immediately.

On Wed, 2014-06-18 at 12:05 -0400, Alexander Belopolsky wrote:
On Wed, Jun 18, 2014 at 11:51 AM, Antoine Pitrou <antoine@python.org> wrote:
Rather than adding new operations to bytes/bytearray, an alternative is a separate type ("bitview"?) which would take a writable buffer as argument and then provide the operations over that buffer.
+1
.. and it does not have to be part of stdlib. The advantage of implementing this outside of stdlib is that users of older versions of Python will benefit immediately.
Older versions of Python can just do:
third = [a ^ b for a, b in zip(first, second)]
The problem is that this is more expensive and less readable than:
third = first ^ second
... or ...
first ^= second
I'm not making this proposal on the basis that something can't be done already, but based on the fact that implementing it natively as part of the base types is a natural growth of the language. Of course this can be implemented in a module at the cost of "batteries included," a new dependency, readability and perhaps some additional overhead. I, for one, would not use such a module and would just implement the operations myself (as I have done for the last several years). The reason for this proposal is that such operations seem to me to be extremely natural to bytes/bytearray. And I think at least some others agree. Nathaniel
participants (14)
- Alexander Belopolsky
- Antoine Pitrou
- Chris Angelico
- Daniel Holth
- Ethan Furman
- Greg Ewing
- Guido van Rossum
- MRAB
- Nathaniel McCallum
- Nick Coghlan
- Serhiy Storchaka
- Stefan Behnel
- Steven D'Aprano
- Terry Reedy