On Thu, 17 May 2018 23:13:22 +1000, Steven D'Aprano wrote:

> No, he didn't explain the meaning. He gave an example, but not a reason
> why it should do what he showed.
>
> Why should the *abstract character* 'H' XORed with the abstract
> character 'w' return the abstract character '?'? Why shouldn't the
> result be '>' instead?

My initial thought was that 'H' ^ 'w' -> '?' because when I was experimenting with the idea, ord('H') ^ ord('w') -> ord('?'). However, I do see your point that different encodings give different results, so I'll drop the idea of bitwise operations on strings.
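
For reference, here is a minimal sketch of that experiment. The helper names xor_chars and xor_in_encoding are hypothetical, not a proposed API, and cp037 is just one EBCDIC codec chosen to illustrate how the answer can depend on the encoding:

```python
# Hypothetical helpers for illustration only; neither is a proposed API.

def xor_chars(a, b):
    """XOR the Unicode code points of two single-character strings."""
    return chr(ord(a) ^ ord(b))


def xor_in_encoding(a, b, encoding):
    """XOR two characters byte-by-byte in a given single-byte encoding."""
    x, y = a.encode(encoding), b.encode(encoding)
    return bytes(p ^ q for p, q in zip(x, y)).decode(encoding)


print(xor_chars('H', 'w'))                 # ?  (0x48 ^ 0x77 == 0x3F)
print(xor_in_encoding('H', 'w', 'ascii'))  # ?
print(xor_in_encoding('H', 'w', 'cp037'))  # >  (0xC8 ^ 0xA6 == 0x6E in EBCDIC)
```

The code-point result and the ASCII result agree, but the same two characters XORed as EBCDIC bytes give a different character.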

> XORing code points could easily generate invalid Unicode sequences
> containing lone surrogates, say, or undefined characters. Or as you
> point out, out of range values.

Invalid Unicode sequences, lone surrogates, and undefined characters, IMO, are simply consequences of misusing the operators. I hadn't anticipated the ValueError from XORing '\U00100000' with '\U00010000', though (the result, 0x110000, is one past the highest valid code point), which is another reason for me to drop bitwise operations on strings.
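
A short sketch of both failure modes, using only built-ins. The particular pair used for the surrogate case ('\ud000' and '\u0800') is my own pick for illustration, not something from the earlier messages:

```python
# Out-of-range case: the XOR lands one past the last valid code point.
code_point = ord('\U00100000') ^ ord('\U00010000')
print(hex(code_point))  # 0x110000

out_of_range = False
try:
    chr(code_point)
except ValueError:
    out_of_range = True

# Lone-surrogate case: two ordinary characters XOR to U+D800.
surrogate = chr(ord('\ud000') ^ ord('\u0800'))
print(hex(ord(surrogate)))  # 0xd800

# chr() accepts the surrogate, but it cannot be encoded as UTF-8.
unencodable = False
try:
    surrogate.encode('utf-8')
except UnicodeEncodeError:
    unencodable = True

print(out_of_range, unencodable)  # True True
```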

> But XORing bytes seems perfectly reasonable. Bytes are numbers, even if
> we display them as ASCII characters.

My thought exactly.

On Thu, 17 May 2018 22:20:43 +1000, Steven D'Aprano wrote:

> What if the strings are unequal lengths?

(out-of-order quote lol)

Then the operators would raise a ValueError. (Assuming bytestrings, since again, I'm dropping text strings.)
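
Roughly, the behaviour I have in mind could be sketched as a plain function. xor_bytes is a hypothetical name, and the error message is only a placeholder; an actual operator on bytes might differ in the details:

```python
def xor_bytes(a, b):
    """XOR two bytes objects element-wise; hypothetical stand-in for bytes ^ bytes."""
    if len(a) != len(b):
        # Assumed behaviour: unequal lengths are an error, not truncated or padded.
        raise ValueError("operands must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


print(xor_bytes(b'H', b'w'))  # b'?'

try:
    xor_bytes(b'Hello', b'w')
except ValueError as exc:
    print("ValueError:", exc)
```

Raising instead of silently truncating to the shorter operand (which is what a bare zip() would do) keeps a length mismatch from going unnoticed.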

Sharing ideas

Ken Hilton