[Python-ideas] String and bytes bitwise operations

Ken Hilton kenlhilton at gmail.com
Thu May 17 10:58:08 EDT 2018


On Thu, 17 May 2018 23:13:22 +1000, Steven D'Aprano wrote:
> No, he didn't explain the meaning. He gave an example, but not a reason
> why it should do what he showed.
>
> Why should the *abstract character* 'H' XORed with the abstract
> character 'w' return the abstract character '?'? Why shouldn't the
> result be '>' instead?

My initial thought was that 'H' ^ 'w' -> '?' because when I was
experimenting with the idea, ord('H') ^ ord('w') -> ord('?'). However, I do
see your point that different encodings give different results, so I'll
drop the idea of bitwise operations on strings.
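
A minimal sketch of that encoding dependence, for illustration only. The
helper name xor_chars is hypothetical, and cp037 (an EBCDIC codec shipped
with Python) is just one assumed example of a non-ASCII-compatible encoding:

```python
def xor_chars(a: str, b: str, encoding: str) -> str:
    """XOR two characters through their single-byte encodings.

    Hypothetical helper; only meaningful for encodings that map each
    of the two characters to exactly one byte.
    """
    (x,) = a.encode(encoding)  # unpacking fails unless exactly one byte
    (y,) = b.encode(encoding)
    return bytes([x ^ y]).decode(encoding)


# XOR on the raw code points, as in the original experiment.
print(chr(ord('H') ^ ord('w')))        # '?'

# The same characters XORed through two different encodings.
print(xor_chars('H', 'w', 'ascii'))    # '?'
print(xor_chars('H', 'w', 'cp037'))    # '>'
```

The "character" that comes out depends entirely on which numbers the
encoding assigns, which is the objection being conceded here.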

> XORing code points could easily generate invalid Unicode sequences
> containing lone surrogates, say, or undefined characters. Or as you
> point out, out of range values.

Invalid Unicode sequences, lone surrogates, and undefined characters are,
IMO, simply consequences of misusing the operators. I hadn't anticipated the
ValueError for '\U00100000' and '\U00010000', though (their code points XOR
to 0x110000, one past the largest valid code point, 0x10FFFF), which is
another reason for me to drop bitwise operations on strings.
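
Both failure modes can be reproduced with a short sketch. The function
xor_code_points is a hypothetical stand-in for the proposed str operator,
not anything that exists in Python, and the second pair of characters is
chosen only because it happens to XOR to a surrogate:

```python
def xor_code_points(a: str, b: str) -> str:
    """Hypothetical stand-in for 'a ^ b' on single-character strings."""
    return chr(ord(a) ^ ord(b))


# Out-of-range result: 0x100000 ^ 0x010000 == 0x110000 > 0x10FFFF.
try:
    xor_code_points('\U00100000', '\U00010000')
except ValueError as exc:
    print('ValueError:', exc)

# Lone surrogate: 0xE000 ^ 0x3800 == 0xD800. chr() accepts it, but the
# resulting string cannot be encoded as UTF-8.
lone = xor_code_points('\ue000', '\u3800')
print(hex(ord(lone)))                  # 0xd800
try:
    lone.encode('utf-8')
except UnicodeEncodeError as exc:
    print('UnicodeEncodeError:', exc.reason)
```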

> But XORing bytes seems perfectly reasonable. Bytes are numbers, even if
> we display them as ASCII characters.

My thought exactly.

On Thu, 17 May 2018 22:20:43 +1000, Steven D'Aprano wrote:
> What if the strings are unequal lengths?

(out-of-order quote lol)
Then the operators would raise a ValueError. (Assuming bytestrings, since
again, I'm dropping text strings.)
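
A rough sketch of those semantics in today's Python. The function name
xor_bytes and the error message are assumptions for illustration; the
proposal itself is for a ^ operator on bytes, which does not exist:

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Element-wise XOR of two equal-length bytestrings.

    Raises ValueError on a length mismatch, mirroring the behaviour
    suggested for the proposed operator.
    """
    if len(a) != len(b):
        raise ValueError(
            f"operands must have equal length ({len(a)} != {len(b)})"
        )
    # Iterating over bytes yields ints in range(256), so x ^ y stays in range.
    return bytes(x ^ y for x, y in zip(a, b))


print(xor_bytes(b'H', b'w'))               # b'?'
print(xor_bytes(b'\x0f\xf0', b'\xff\xff')) # b'\xf0\x0f'

try:
    xor_bytes(b'abc', b'ab')
except ValueError as exc:
    print('ValueError:', exc)
```

Raising rather than truncating or padding keeps the operation lossless:
XORing the result with either operand always gives back the other.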

Sharing ideas,
Ken Hilton;