[Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

Ethan Furman ethan at stoneleaf.us
Thu Mar 27 19:04:12 CET 2014


On 03/27/2014 10:55 AM, Ethan Furman wrote:
> On 03/27/2014 10:29 AM, Guido van Rossum wrote:
>>
>> I also don't understand why we can't use %b instead of %s. AFAIK %b currently doesn't mean anything and I somehow don't
>> expect we're likely to add it for other reasons (unless there's a proposal I'm missing?). Just like we use %a instead of
>> %r to remind people that it's not quite the same (since it applies .encode('ascii', 'backslashreplace')), shouldn't we
>> use anything *but* %s to remind people that that is also not the same (not at all, in fact)? The PEP's argument against
>> %b ("rejected as not adding any value either in clarity or simplicity") is hardly a good reason.
>
> The biggest reason to use %s is to support a common code base for 2/3 endeavors.  The biggest reason to not include %b
> is that it means binary number in format(); given that each type can invent its own mini-language, this probably isn't
> a very strong argument against it.
>
> I have moderate feelings for keeping %s as a synonym for %b for backwards compatibility with Py2 code (when it's
> appropriate).

Changed to:
----------------------------------------------------------------------------------
``%b`` will insert a series of bytes.  These bytes are collected in one of two
ways:

   - input type supports ``Py_buffer`` [4]_?
     use it to collect the necessary bytes

   - input type is something else?
     use its ``__bytes__`` method [5]_ ; if there isn't one, raise a ``TypeError``
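
The two collection rules above can be sketched in pure Python.  This is only an
illustration, not the C implementation: ``collect_bytes`` is a hypothetical
helper name, and ``memoryview`` is used here as a stand-in for the ``Py_buffer``
check.

```python
def collect_bytes(value):
    """Hypothetical sketch of how ``%b`` might gather its bytes."""
    # Rule 1: the object supports the buffer protocol, so read from it.
    # memoryview() raises TypeError for objects that do not export a buffer.
    try:
        view = memoryview(value)
    except TypeError:
        pass
    else:
        return view.tobytes()

    # Rule 2: fall back to __bytes__, looked up on the type.
    to_bytes = getattr(type(value), '__bytes__', None)
    if to_bytes is None:
        raise TypeError("b'%%b' does not accept %r" % type(value).__name__)
    return to_bytes(value)


class Packet:
    """Illustrative user type that only provides __bytes__."""
    def __bytes__(self):
        return b'\x01\x02'


buffer_result = collect_bytes(bytearray(b'abc'))   # via the buffer protocol
method_result = collect_bytes(Packet())            # via __bytes__
```

Under this sketch, ``str``, ``int`` and ``float`` fall through both rules and
raise ``TypeError``, since none of them exports a buffer or defines ``__bytes__``.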

In particular, ``%b`` will accept neither numbers nor ``str``.  ``str`` is rejected
because the string to bytes conversion requires an encoding, and we are refusing to
guess; numbers are rejected because:

   - what makes a number is fuzzy (float? Decimal? Fraction? some user type?)

   - allowing numbers would lead to ambiguity between numbers and textual
     representations of numbers (3.14 vs '3.14')

   - given the nature of wire formats, explicit is definitely better than implicit
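
To make the ambiguity concrete, here is a short sketch of the explicit
conversions a caller would write instead.  It assumes an interpreter in which
bytes ``%`` formatting as described here is available (Python 3.5 or later);
the variable names and the choice of ``struct`` format are illustrative only.

```python
import struct

number = 3.14
text = 'hello world!'

# Textual representation of the number: the caller picks the encoding.
as_digits = b'%b' % str(number).encode('ascii')

# Binary representation of the same number: eight bytes of little-endian
# IEEE 754 double, which is a completely different payload.
as_double = b'%b' % struct.pack('<d', number)

# str must likewise be encoded explicitly before it is interpolated.
as_text = b'%b' % text.encode('utf-8')
```

Both ``as_digits`` and ``as_double`` are reasonable readings of "insert 3.14",
which is why the choice is left to the caller rather than guessed.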

``%s`` is included as a synonym for ``%b`` for the sole purpose of making 2/3 code
bases easier to maintain.  Python 3-only code should use ``%b``.

Examples::

     >>> b'%b' % b'abc'
     b'abc'

     >>> b'%b' % 'some string'.encode('utf8')
     b'some string'

     >>> b'%b' % 3.14
     Traceback (most recent call last):
     ...
     TypeError: b'%b' does not accept 'float'

     >>> b'%b' % 'hello world!'
     Traceback (most recent call last):
     ...
     TypeError: b'%b' does not accept 'str'
----------------------------------------------------------------------------------

--
~Ethan~

