On 11 September 2014 10:42, Chris Angelico firstname.lastname@example.org wrote:
> On Thu, Sep 11, 2014 at 4:35 AM, Chris Lasher email@example.com wrote:
>> Unless printable representation of bytes objects appears as part of the language specification for Python 3, it's an implementation detail, thus, it is a candidate for change, especially if the BDFL wills it so.
> So this is all about the output of repr(), right? The question then is: How important is backward compatibility with repr? Will there be code breakage?
I changed PyBytes_Repr to inject a 'Z' after the opening quote to see just how extensive the damage would be in CPython's own regression test suite (as I belatedly realised the magnitude of the impact may not be obvious to everyone, so I figured it was worth quantifying):
355 tests OK.
17 tests failed:
    test_base64 test_bytes test_configparser test_ctypes test_doctest
    test_file_eintr test_hash test_io test_pdb test_pickle
    test_pickletools test_re test_smtpd test_subprocess test_sys
    test_telnetlib test_tools
1 test altered the execution environment:
    test_warnings
17 tests skipped:
    test_curses test_devpoll test_kqueue test_msilib test_ossaudiodev
    test_smtpnet test_socketserver test_startfile test_timeout test_tk
    test_ttk_guionly test_urllib2net test_urllibnet test_winreg
    test_winsound test_xmlrpc_net test_zipfile64
I ran those tests without enabling *any* of the optional resources (and the Windows specific tests won't run on my machine).
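For anyone who wants a feel for the experiment without rebuilding CPython, here is a rough pure-Python sketch of the idea. The helper name z_repr is hypothetical and only imitates the effect of the patched repr; it is not the actual C change.

```python
def z_repr(data: bytes) -> str:
    """Imitate the experimental build: insert 'Z' after the opening quote."""
    text = repr(data)  # e.g. "b'foo'"
    # text[0] is the 'b' prefix and text[1] is the opening quote character.
    return text[:2] + "Z" + text[2:]


# What a doctest or assertion written against the current repr expects.
expected = "b'foo'"

print(repr(b"foo") == expected)    # comparison against the unmodified repr
print(z_repr(b"foo") == expected)  # comparison once the output format changes
print(z_repr(b"foo"))
```

Any test that compares repr output against a literal string, as doctests do, stops matching as soon as the format changes, even though the underlying bytes are identical.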
Folks should keep in mind that when we talk about "hybrid ASCII binary data", we're not just talking about things like SMTP and HTTP 1.1 and debugging network protocol traffic, we're also talking about things like URLs, filesystem paths, email addresses, environment variables, command line arguments, process names, passing UTF-8 encoded data to GUI frameworks, etc., all of which are often both ASCII compatible and human readable *by design*.
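As a small illustration of that point, the sketch below prints a few ASCII-compatible byte strings both through the current repr and as a plain hex dump. The sample values are made up for the example and are not taken from any real system.

```python
# Illustrative payloads only; the values are invented for this example.
samples = [
    b"GET /index.html HTTP/1.1\r\n",  # HTTP request line
    b"/usr/local/bin/python3",         # filesystem path
    b"LANG=en_AU.UTF-8",               # environment variable
]

for data in samples:
    print(repr(data))  # ASCII-compatible bytes stay human readable
    print(data.hex())  # the same bytes rendered as hex digits
```

The repr lines can be read at a glance, while the hex lines need decoding before a human can tell a path from a request line.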
Note the error message produced here with my modified build:
$ ./python -c 'import os; print(os.listdir(b"foo"))'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: b'Zfoo'
And this directory listing:
$ ./python -c 'import os; print(os.listdir(b"Mac"))'
[b'ZIDLE', b'ZMakefile.in', b'ZTools', b'ZREADME.orig', b'ZPythonLauncher', b'ZIcons', b'ZREADME', b'ZExtras.install.py', b'ZBuildScript', b'ZResources']
Python 3 carved out a whole lot of text processing operations and said "these are clearly and unambiguously working with text data, we shouldn't confuse them with binary data manipulation". The remaining ambiguity in the behaviour of the Python 3 bytes type is largely inherent in the way computers currently work - there's no getting away from it.