On 09/10/2014 12:57 PM, Steven D'Aprano wrote:
On Wed, Sep 10, 2014 at 12:04:23AM -0700, Chris Lasher wrote:
[...]
I would like to gauge the feasibility of a PEP to change the printable representation of bytes in CPython 3 to display all elements by their hexadecimal values, and only by their hexadecimal values.
I'm very sympathetic to this "purity" approach. I too consider it a shame that the repr of byte-strings in Python 3 pretends to be ASCII-ish[1], regardless of the source of the bytes. Alas, not only do we have backward compatibility to consider -- there are now five versions of Python 3 where bytes display as ASCII -- but practicality as well. There are many use-cases where human-readable ASCII bytes are embedded inside otherwise binary bytes. To my regret, I don't think purity arguments are strong enough to justify a change.
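To make the current behaviour concrete, here is a small sketch of what repr() does today. The PNG signature is only an illustrative choice of mixed binary/ASCII data and is not taken from the thread.

```python
# Bytes written as hex escapes still display as ASCII where possible.
abc_repr = repr(b'\x41\x62\x63')
print(abc_repr)    # b'Abc'

# Illustrative mixed data: the 8-byte PNG file signature.
png_signature = b'\x89PNG\r\n\x1a\n'
png_repr = repr(png_signature)
print(png_repr)    # b'\x89PNG\r\n\x1a\n'
```

The second case is the practicality argument in miniature: the "PNG" marker stays readable while the non-printable bytes are escaped.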
However, I do support Terry's suggestion that bytes (and, I presume, bytearray) grow some sort of easy way of displaying the bytes in hex. The trouble is, what do we actually want?
b'Abc' --> '0x416263'
b'Abc' --> '\x41\x62\x63'
I can see use-cases for both. After less than two minutes of thought, it seems to me that perhaps the most obvious APIs for these two different representations are:
hex(b'Abc') --> '0x416263'
This would require a change in the documented (https://docs.python.org/3/library/functions.html#hex) behavior of hex(), which I think is quite a big deal for a relatively special case.
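For comparison, hex() currently accepts only integers (objects with __index__), so hex(b'Abc') raises TypeError. The string proposed above can be produced without touching hex(); the following is a minimal sketch using binascii.hexlify, where the helper name bytes_to_hex is hypothetical and not part of any proposal in this thread.

```python
import binascii

def bytes_to_hex(data):
    """Hypothetical helper: '0x' followed by two hex digits per byte."""
    # bytes(data) lets the helper accept bytearray as well as bytes.
    return '0x' + binascii.hexlify(bytes(data)).decode('ascii')

print(bytes_to_hex(b'Abc'))   # 0x416263
```

Note that, unlike hex() on an integer, this keeps leading zero bytes: bytes_to_hex(b'\x00\x01') gives '0x0001', so the byte count is preserved.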
b'Abc'.decode('hexescapes') --> '\x41\x62\x63'
This, OTOH, looks elegant (avoids a new method) and clear (no doubt about the returned type) to me. +1
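As a rough illustration of how this could be prototyped today, the sketch below registers a codec under the name 'hexescapes' through codecs.register. This assumes no codec of that name already ships with Python; the function names are made up for the example, and only the decoding direction is implemented.

```python
import codecs

def _hexescapes_decode(data, errors='strict'):
    # Decoder contract: return (decoded_text, number_of_bytes_consumed).
    raw = bytes(data)
    text = ''.join('\\x{:02x}'.format(byte) for byte in raw)
    return text, len(raw)

def _hexescapes_encode(text, errors='strict'):
    # Assumption: the reverse direction is out of scope for this sketch.
    raise NotImplementedError('hexescapes is decode-only in this sketch')

def _hexescapes_search(name):
    # Codec search functions return a CodecInfo, or None for unknown names.
    if name == 'hexescapes':
        return codecs.CodecInfo(
            name='hexescapes',
            encode=_hexescapes_encode,
            decode=_hexescapes_decode,
        )
    return None

codecs.register(_hexescapes_search)

print(b'Abc'.decode('hexescapes'))   # \x41\x62\x63
```

Because the decoder returns a str, bytes.decode() accepts it like any other text encoding, and bytearray.decode() works the same way.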
Wolfgang