On Thu, May 17, 2018 at 11:07 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Steven D'Aprano wrote:
But XORing bytes seems perfectly reasonable. Bytes are numbers, even if we display them as ASCII characters.

Actually, bytes are, well, bytes ;-) -- that is, 8 bits. But the point is that "bitwise" operations make all the sense in the world for bytes, but not for Unicode text -- did anyone have a use case for bitwise operations on Unicode strings? I can't imagine one, even if you could agree on a definition...

Yep. Implement it for bytes, 

What exactly would be implemented? 

bytes is a sequence type, so would a bitwise operator perform the operation "elementwise"? (i.e. like numpy)

or would it operate on the whole sequence as a single collection of bits?

Would it be any different? Without thinking hard, it seems some operations, like AND and OR and XOR would be the same, but bit shifting would be different.
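
To make the difference concrete, here is a rough sketch of both readings in plain Python. The function names are made up for illustration, and treating the sequence as one big-endian bit string is just one possible assumption, not a proposed definition:

```python
def xor_elementwise(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length bytes objects one byte at a time."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def xor_as_bits(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length bytes objects, each treated as one big-endian bit string."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    n = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return n.to_bytes(len(a), "big")


def shift_left_elementwise(data: bytes, count: int) -> bytes:
    """Shift each byte left on its own; bits pushed out of a byte are lost."""
    return bytes((x << count) & 0xFF for x in data)


def shift_left_as_bits(data: bytes, count: int) -> bytes:
    """Shift the whole sequence left as one big-endian bit string, keeping its length."""
    mask = (1 << (8 * len(data))) - 1
    n = (int.from_bytes(data, "big") << count) & mask
    return n.to_bytes(len(data), "big")


a = b"\x0f\xf0"
b = b"\xff\x81"
same_xor = xor_elementwise(a, b) == xor_as_bits(a, b)   # True: XOR never crosses byte boundaries

data = b"\x01\x80"
per_byte = shift_left_elementwise(data, 1)   # b"\x02\x00": the high bit of 0x80 is dropped
whole = shift_left_as_bits(data, 1)          # b"\x03\x00": the high bit of 0x80 carries into the first byte
```

So AND, OR and XOR come out the same either way, while a shift depends on whether bits are allowed to cross byte boundaries.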

And then what do you do if the two bytes objects are not the same length?
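
Just to show that there is more than one defensible answer, here is a sketch of three policies for mismatched lengths. All three function names are hypothetical, and none of them is being proposed as the right behavior:

```python
from itertools import zip_longest


def xor_strict(a: bytes, b: bytes) -> bytes:
    """Refuse operands of different lengths."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def xor_truncate(a: bytes, b: bytes) -> bytes:
    """Stop at the end of the shorter operand (what a bare zip() does)."""
    return bytes(x ^ y for x, y in zip(a, b))


def xor_zero_pad(a: bytes, b: bytes) -> bytes:
    """Pad the shorter operand on the right with zero bytes."""
    return bytes(x ^ y for x, y in zip_longest(a, b, fillvalue=0))


a = b"\x01\x02\x03"
b = b"\xff\xff"

truncated = xor_truncate(a, b)   # b"\xfe\xfd"
padded = xor_zero_pad(a, b)      # b"\xfe\xfd\x03"
```

Padding on the right versus the left is yet another choice, and it matters as soon as the bytes are meant to be read as a single integer.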

If "elementwise", then we should think carefully about that -- nowhere else does Python do things elementwise in the standard library -- and we already have numpy if you want to do it now.

And a bytes object can represent any type of data -- so one byte at a time might make sense, but maybe two bytes at a time makes more sense -- and if the data represents, say, an integer, then endianness matters...
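
A small illustration using the stdlib struct module: the same four bytes, shifted right by one bit, give three different answers depending on the element size and byte order you assume. The helper name shift_right is made up for this example:

```python
import struct


def shift_right(data: bytes, count: int, fmt: str) -> bytes:
    """Interpret `data` using the struct format `fmt`, shift each value, and repack."""
    values = struct.unpack(fmt, data)
    return struct.pack(fmt, *(v >> count for v in values))


data = b"\x01\x01\x01\x01"

as_bytes = shift_right(data, 1, "4B")     # four 8-bit values  -> b"\x00\x00\x00\x00"
as_u16_be = shift_right(data, 1, ">2H")   # two big-endian 16-bit values    -> b"\x00\x80\x00\x80"
as_u16_le = shift_right(data, 1, "<2H")   # two little-endian 16-bit values -> b"\x80\x00\x80\x00"
```

The bytes object itself carries none of that information, so an operator defined on bytes would have to pick one interpretation for everybody.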

All this is handled by numpy by having multiple data types that can be "mapped" to a buffer.
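
For anyone who hasn't used it, this is roughly what that looks like. It is only a sketch and assumes numpy is installed; the buffer contents are arbitrary:

```python
import numpy as np

buf = b"\x01\x02\x03\x04"

# The same buffer viewed through three different data types (no copy is made).
u8 = np.frombuffer(buf, dtype=np.uint8)    # four unsigned 8-bit values
u16_le = np.frombuffer(buf, dtype="<u2")   # two little-endian unsigned 16-bit values
u16_be = np.frombuffer(buf, dtype=">u2")   # two big-endian unsigned 16-bit values

# Elementwise XOR against a mask of the same shape and dtype.
mask = np.frombuffer(b"\xff\x00\xff\x00", dtype=np.uint8)
xored = (u8 ^ mask).tobytes()              # b"\xfe\x02\xfc\x04"
```

The element size and byte order live in the dtype, not in the buffer, which is why the elementwise operators can be unambiguous there.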

-CHB


--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov