[Python-ideas] Stop displaying elements of bytes objects as printable ASCII characters in CPython 3

Steven D'Aprano steve at pearwood.info
Wed Sep 10 12:57:32 CEST 2014


On Wed, Sep 10, 2014 at 12:04:23AM -0700, Chris Lasher wrote:

[...]
> I would like to gauge the feasibility of a PEP to change the printable
> representation of bytes in CPython 3 to display all elements by their
> hexadecimal values, and only by their hexadecimal values.

I'm very sympathetic to this "purity" approach. I too consider it a 
shame that the repr of byte-strings in Python 3 pretends to be 
ASCII-ish[1], regardless of the source of the bytes. Alas, not only do 
we have backward compatibility to consider -- there are now five versions 
of Python 3 where bytes display as ASCII -- but practicality as well. 
There are many use-cases where human-readable ASCII bytes are embedded 
inside otherwise binary bytes. To my regret, I don't think purity 
arguments are strong enough to justify a change.
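
A minimal illustration of that mixed case, using the 8-byte PNG file signature as an assumed example (it is not from the original message). The names png_signature, ascii_style and hex_only are made up for this sketch; it only contrasts the current repr with a hand-built hex-only rendering.

```python
# The 8-byte PNG signature mixes binary bytes with the ASCII text "PNG".
png_signature = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

# Current repr: printable ASCII bytes are shown as characters,
# everything else as escapes.
ascii_style = repr(png_signature)

# A hex-only rendering of the same bytes, built by hand for comparison.
# This is NOT an existing repr mode, just an illustration.
hex_only = "b'" + "".join("\\x%02x" % b for b in png_signature) + "'"

print(ascii_style)
print(hex_only)
```

In the first form the "PNG" marker is readable at a glance; in the second it has to be decoded by eye.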

However, I do support Terry's suggestion that bytes (and, I presume, 
bytearray) grow some sort of easy way of displaying the bytes in hex. 
The trouble is, what do we actually want?

b'Abc' --> '0x416263'
b'Abc' --> '\x41\x62\x63'

I can see use-cases for both. After less than two minutes of thought, it 
seems to me that perhaps the most obvious APIs for these two different 
representations are:

hex(b'Abc') --> '0x416263'
b'Abc'.decode('hexescapes') --> '\x41\x62\x63'
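
Neither spelling above exists today: the built-in hex() accepts only integers, and there is no 'hexescapes' codec. As a rough sketch of what the two representations could produce, here are two standalone helpers built on the stdlib binascii module. The names hex_string and hex_escapes are hypothetical, chosen only for this example, and are not proposed API.

```python
import binascii


def hex_string(data):
    """Return one '0x'-prefixed hex number, e.g. '0x416263' for b'Abc'.

    Hypothetical helper standing in for the proposed hex(b'Abc').
    """
    return '0x' + binascii.hexlify(bytes(data)).decode('ascii')


def hex_escapes(data):
    """Return each byte as a literal backslash-x escape.

    Hypothetical helper standing in for the proposed 'hexescapes' codec.
    The result contains real backslash characters, so for b'Abc' it is
    the 12-character text \\x41\\x62\\x63, not the 3-character 'Abc'.
    """
    return ''.join('\\x%02x' % byte for byte in bytes(data))


print(hex_string(b'Abc'))
print(hex_escapes(b'Abc'))
```

Both helpers also accept a bytearray, since the argument is passed through bytes() first.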




[1] Byte-strings are not *strictly* ASCII, since ASCII defines only 
ordinal values 0 through 127, while a byte can be as high as 255.

-- 
Steven

