[Python-ideas] Stop displaying elements of bytes objects as printable ASCII characters in CPython 3
Nick Coghlan
ncoghlan at gmail.com
Thu Sep 11 04:35:29 CEST 2014
On 11 September 2014 11:57, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Folks should keep in mind that when we talk about "hybrid ASCII binary
> data", we're not just talking about things like SMTP and HTTP 1.1 and
> debugging network protocol traffic, we're also talking about things
> like URLs, filesystem paths, email addresses, environment variables,
> command line arguments, process names, passing UTF-8 encoded data to
> GUI frameworks, etc that are often both ASCII compatible and human
> readable *by design*.
>
> Note the error message produced here with my modified build:
>
> $ ./python -c 'import os; print(os.listdir(b"foo"))'
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
> FileNotFoundError: [Errno 2] No such file or directory: b'Zfoo'
>
> And this directory listing:
>
> $ ./python -c 'import os; print(os.listdir(b"Mac"))'
> [b'ZIDLE', b'ZMakefile.in', b'ZTools', b'ZREADME.orig',
> b'ZPythonLauncher', b'ZIcons', b'ZREADME', b'ZExtras.install.py',
> b'ZBuildScript', b'ZResources']
After posting that version, I realised that actually making the proposed
change would be similarly straightforward, and would better illustrate
the core problem with the idea:
$ ./python -c 'import os; print(os.listdir(b"foo"))'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: b'\x66\x6f\x6f'
$ ./python -c 'import os; print(os.listdir(b"Mac"))'
[b'\x49\x44\x4c\x45', b'\x4d\x61\x6b\x65\x66\x69\x6c\x65\x2e\x69\x6e',
b'\x54\x6f\x6f\x6c\x73',
b'\x52\x45\x41\x44\x4d\x45\x2e\x6f\x72\x69\x67',
b'\x50\x79\x74\x68\x6f\x6e\x4c\x61\x75\x6e\x63\x68\x65\x72',
b'\x49\x63\x6f\x6e\x73', b'\x52\x45\x41\x44\x4d\x45',
b'\x45\x78\x74\x72\x61\x73\x2e\x69\x6e\x73\x74\x61\x6c\x6c\x2e\x70\x79',
b'\x42\x75\x69\x6c\x64\x53\x63\x72\x69\x70\x74',
b'\x52\x65\x73\x6f\x75\x72\x63\x65\x73']
vs
$ python3 -c 'import os; print(os.listdir(b"foo"))'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'foo'
$ python3 -c 'import os; print(os.listdir(b"Mac"))'
[b'IDLE', b'Makefile.in', b'Tools', b'README.orig', b'PythonLauncher',
b'Icons', b'README', b'Extras.install.py', b'BuildScript',
b'Resources']
It's more than just a matter of backwards compatibility; it's a matter
of asymmetry of impact when each of the two possible design choices is
wrong:
* Using a hex-based repr when an ASCII-based repr is more appropriate
is utterly unreadable
* Using an ASCII-based repr when a hex-based repr is more appropriate
is somewhat confusing
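The asymmetry can be reproduced without a modified build. The following is only a rough sketch: hex_repr is a hypothetical helper that imitates the hex-only output shown above, not the code from the patched interpreter, and the sample values are made up for illustration.

```python
def hex_repr(data: bytes) -> str:
    """Hypothetical stand-in for a hex-only bytes repr (not CPython's code)."""
    return "b'" + "".join("\\x%02x" % byte for byte in data) + "'"


# Case 1: ASCII-compatible data, where an ASCII-based repr is appropriate.
filename = b"README.orig"
print(hex_repr(filename))  # every character escaped, unreadable
print(repr(filename))      # b'README.orig'

# Case 2: genuinely binary data, where a hex-based repr is appropriate.
# The byte 0x41 shows up as "A" in the standard repr, which is mildly
# confusing, but the other values are still visible as escapes.
payload = bytes([0x00, 0x41, 0xFF, 0x7F])
print(hex_repr(payload))
print(repr(payload))
```

In the first case the wrong choice hides the data entirely; in the second it only disguises the bytes that happen to fall in the printable ASCII range.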
This kind of thing is why the original "binary representation by
default" design didn't survive the Python 3.0 development cycle - once
people started trying it out, it quickly became evident that it was
the wrong approach to take (if I remember the original implementation
correctly, the repr was along the lines of "bytes([1, 2, 3, 4])" since
there wasn't a bytes literal until after PEP 3137 was implemented).
Making hex representations of binary data easier to produce is still a
good idea, though.
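For reference, here is a minimal sketch of how a hex view can be produced today with the standard library. binascii.hexlify is an existing function; spaced_hex is a hypothetical convenience helper added purely for illustration, and is not a proposed API.

```python
import binascii


def spaced_hex(data: bytes) -> str:
    """Hypothetical helper: two hex digits per byte, separated by spaces."""
    return " ".join("%02x" % byte for byte in data)


data = b"foo\x00\xff"

# hexlify returns bytes, so decode to get a str for display.
print(binascii.hexlify(data).decode("ascii"))  # 666f6f00ff
print(spaced_hex(data))                        # 66 6f 6f 00 ff
```

Both forms need an explicit call plus some boilerplate, which is the inconvenience worth reducing.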
Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia