On Mon, Jan 6, 2014 at 6:24 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
Abstract
========
Add ``bytes % args`` operator and ``bytes.format(args)`` method to Python 3.5.
Rationale
=========
``bytes % args`` and ``bytes.format(args)`` were removed in Python 3. Mercurial and Twisted developers have requested this operator and this method to ease porting their projects to Python 3.
Python 3 suggests formatting text first and then encoding it to bytes. In some cases this does not make sense, because the arguments are byte strings. A typical use case is a binary network protocol, where data is sent to and received from sockets. For example, SMTP, SIP, HTTP, IMAP, POP and FTP use ASCII commands interspersed with binary data.
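To make the "format text first, then encode" workaround concrete, here is a minimal sketch that frames a binary payload as an HTTP-style chunk. The function name ``build_chunk`` is hypothetical and chosen only for illustration; it is not taken from Mercurial, Twisted, or the PEP.

```python
def build_chunk(payload: bytes) -> bytes:
    """Frame a binary payload with an ASCII hex length line (illustrative only)."""
    # The ASCII part has to be formatted as text and then encoded ...
    header = ("%x\r\n" % len(payload)).encode("ascii")
    # ... while the payload is already bytes and cannot go through str
    # formatting without a decode/encode round trip that may not be valid.
    return header + payload + b"\r\n"
```

The payload never passes through a text encoding, which is the point of the complaint: only part of the message is text, yet the formatting step forces a detour through ``str``.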
Using multiple ``bytes + bytes`` operations is inefficient because it requires temporary buffers and copies, which are slow and waste memory. Python 3.3 optimizes ``str2 += str1`` but not ``bytes2 += bytes1``.
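As a rough illustration of the concatenation cost described above, the sketch below contrasts repeated ``+=`` on ``bytes`` with two common ways of avoiding the intermediate copies. The function names are hypothetical, and no timings are claimed here.

```python
def concat_naive(parts):
    """Repeated +=: each step may build a new bytes object and copy the old one."""
    result = b""
    for part in parts:
        result += part
    return result


def concat_join(parts):
    """Single pass: join computes the total size and copies each part once."""
    return b"".join(parts)


def concat_bytearray(parts):
    """Mutable buffer: bytearray grows in place, then is converted once."""
    buf = bytearray()
    for part in parts:
        buf += part
    return bytes(buf)
```

All three return the same bytes; they differ only in how many temporary objects are created along the way.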
``bytes % args`` and ``bytes.format(args)`` have been requested since 2008, even before the first release of Python 3.0 (see issue #3982).
``struct.pack()`` is incomplete. For example, it cannot format a number as decimal, and it does not support padding byte strings.
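One way to read the ``struct.pack()`` limitation, sketched with standard-library calls only: ``struct`` produces binary encodings of numbers, and its ``s`` format pads only with NUL bytes, so ASCII decimal output and space padding need other tools. The variable names are illustrative.

```python
import struct

# struct encodes the number 42 as binary, not as ASCII decimal digits.
packed = struct.pack(">H", 42)          # big-endian unsigned 16-bit value

# ASCII decimal output needs a detour through str.
decimal = str(42).encode("ascii")

# The "5s" format pads a short byte string with NUL bytes only.
nul_padded = struct.pack("5s", b"ab")

# Space padding, as with "%-5s" in Python 2, needs bytes.ljust instead.
space_padded = b"ab".ljust(5)
```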
Mercurial 2.8 still supports Python 2.4.
As an alternative, we could provide an import hook via some channel (cheeseshop? recipe?) that converts just b'' formatting into some Python 3 equivalent (when run under Python 3). The argument against such import hooks is usually that they have an adverse impact on the output of tracebacks. However, I'd expect most b'' formatting to happen on a single line and that the replacement source would stay on that single line. Such an import hook would lessen the desire for bytes formatting.

As I mentioned elsewhere, Nick's counter-proposal of a separate wire-protocol-friendly type makes more sense to me than adding formatting to Python 3's bytes type. As others have opined, formatting a bytes object is out of place. The need is limited in scope and audience, but apparently real. Adding that capability directly to bytes in 3.5 should be a last resort to which we appeal only when we exhaust our other options.

-eric