Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

26 Mar 2014

      On 03/26/2014 03:10 AM, Victor Stinner wrote:
...
2014-03-25 23:37 GMT+01:00 Ethan Furman:
...
``%a`` will call ``ascii()`` on the interpolated value.
I'm not sure that I understood correctly: is the "%a" format
supported? The result of ascii() is a Unicode string. Does it mean
that ("%a" % obj) should give the same result as
ascii(obj).encode('ascii', 'strict')?
Changed to:
-------------------------------------------------------------------------------
``%a`` will give the equivalent of
``repr(some_obj).encode('ascii', 'backslashreplace')`` on the interpolated
value.  Use cases include developing a new protocol and writing landmarks
into the stream; debugging data going into an existing protocol to see if
the problem is the protocol itself or bad data; a fall-back for a serialization
format; or any situation where defining ``__bytes__`` would not be appropriate
but a readable/informative representation is needed [8].
-------------------------------------------------------------------------------
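A minimal sketch of the revised ``%a`` definition, assuming Python 3.5 or later (where bytes %-formatting is available). The ``Landmark`` class and the ``expected_a`` helper are hypothetical names introduced only for this illustration; they are not part of the PEP.

```python
# Requires Python 3.5+ (bytes %-formatting as specified by PEP 461).

class Landmark:
    """Hypothetical object that has a repr but defines no __bytes__."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "<Landmark %s>" % self.name


def expected_a(obj):
    """The equivalence stated in the PEP text for %a."""
    return repr(obj).encode('ascii', 'backslashreplace')


samples = [3.14, 'def', 'caf\xe9', Landmark('start'), None]

# Wrap each value in a one-element tuple so it is always treated as a
# single argument by the % operator.
results = [b'%a' % (obj,) for obj in samples]

for obj, got in zip(samples, results):
    assert got == expected_a(obj)

print(results[2])  # b"'caf\\xe9'"
print(results[3])  # b'<Landmark start>'
```

The landmark case corresponds to the "writing landmarks into the stream" use case: the object gets a readable ASCII representation without needing ``__bytes__``.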
...
Would it be possible to add a table or list to summarize supported
format characters? I found:
- single byte: %c
- integer: %d, %u, %i, %o, %x, %X, %f, %g, "etc." (can you please
complete "etc." ?)
- bytes and __bytes__ method: %s
- ascii(): %a
Changed to:
-------------------------------------------------------------------------------
%-interpolation
---------------

All the numeric formatting codes (``d``, ``i``, ``o``, ``u``, ``x``, ``X``,
``e``, ``E``, ``f``, ``F``, ``g``, ``G``, and any that are subsequently added
to Python 3) will be supported, and will work as they do for str, including
the padding, justification and other related modifiers (currently ``#``, ``0``,
``-``, `` `` (space), and ``+`` (plus any added to Python 3)).  The only
non-numeric codes allowed are ``c``, ``s``, and ``a``.

For the numeric codes, the only difference between ``str`` and ``bytes`` (or
``bytearray``) interpolation is that the results from these codes will be
ASCII-encoded text, not unicode.  In other words, for any numeric formatting
code `%x`::
-------------------------------------------------------------------------------
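The rule for numeric codes can be checked with a short sketch, assuming Python 3.5 or later. The ``cases`` list and the ``check`` helper are illustrative names only; the comparison simply restates the claim that bytes interpolation gives the ASCII encoding of what ``str`` interpolation gives.

```python
# Requires Python 3.5+ (bytes %-formatting as specified by PEP 461).

cases = [
    (b'%d', 42),
    (b'%5d', 42),        # minimum width, right-justified
    (b'%-5d', 42),       # left-justified
    (b'%05d', 42),       # zero-padded
    (b'%+d', 42),        # explicit sign
    (b'% d', 42),        # space in place of a plus sign
    (b'%#x', 255),       # alternate form
    (b'%X', 255),
    (b'%#o', 8),
    (b'%.2f', 3.14159),
    (b'%e', 12345.678),
    (b'%G', 0.00001234),
]


def check(fmt, value):
    """Return the bytes result after comparing it with str formatting."""
    as_bytes = fmt % (value,)
    as_text = fmt.decode('ascii') % (value,)
    assert as_bytes == as_text.encode('ascii')
    return as_bytes


outputs = [check(fmt, value) for fmt, value in cases]

print(outputs[3])  # b'00042'
print(outputs[6])  # b'0xff'
```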
...
I don't understand the purpose of this sentence. Does it mean that %a
must not be used? IMO this sentence can be removed.
The sentence about %a being for debugging has been removed.
...
...
Non-ASCII values will be encoded to either ``\xnn`` or ``\unnnn``
representation.
Unicode is larger than that! print(ascii(chr(0x10ffff))) => '\U0010ffff'
Removed.  With the explicit reference to the 'backslashreplace' error handler,
anyone who wants to know what it might look like can refer to that.
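For reference, a small sketch of what the 'backslashreplace' error handler produces when encoding to ASCII, covering the three escape widths raised in this exchange. It assumes Python 3.5 or later for the final ``%a`` line; the variable names are illustrative.

```python
# The escape form depends on the size of the code point.
escapes = {
    ch: ch.encode('ascii', 'backslashreplace')
    for ch in ('\xe9', '\u20ac', '\U0010ffff')
}

print(escapes['\xe9'])        # b'\\xe9'
print(escapes['\u20ac'])      # b'\\u20ac'
print(escapes['\U0010ffff'])  # b'\\U0010ffff'

# With %a (Python 3.5+), the escaped text also carries the quotes
# that repr() puts around a str.
largest = b'%a' % (chr(0x10ffff),)
print(largest)                # b"'\\U0010ffff'"
```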
...
...
.. note::
If a ``str`` is passed into ``%a``, it will be surrounded by quotes.
And:
- bytes gets a "b" prefix and surrounded by quotes as well  (b'...')
- the quote ' is escaped as \' if the string contains quotes ' and "
Shouldn't be an issue now with the new definition which no longer references the ascii() function.
...
Can you also please add examples for %a?

Examples::

     >>> b'%a' % 3.14
     b'3.14'

     >>> b'%a' % b'abc'
     b"b'abc'"

     >>> b'%a' % 'def'
     b"'def'"
-------------------------------------------------------------------------------
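The quoting details raised above follow directly from ``repr()``. A brief sketch, assuming Python 3.5 or later; the sample strings are made up for illustration.

```python
# Requires Python 3.5+ (bytes %-formatting as specified by PEP 461).

plain = 'def'
with_single = "it's"
with_both = 'it\'s "x"'
raw = b'abc'

quoted = {
    'plain': b'%a' % (plain,),
    'with_single': b'%a' % (with_single,),
    'with_both': b'%a' % (with_both,),
    'raw': b'%a' % (raw,),
}

# Each result matches repr() encoded to ASCII.
assert quoted['with_both'] == repr(with_both).encode('ascii', 'backslashreplace')

print(quoted['plain'])        # b"'def'"
print(quoted['with_single'])  # b'"it\'s"'  (repr switches to double quotes)
print(quoted['raw'])          # b"b'abc'"   (bytes keep their b prefix and quotes)
```

When the string contains both kinds of quote, ``repr()`` keeps single quotes and escapes the embedded ``'`` with a backslash, so the ``%a`` output contains that backslash as well.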
...
...
Proposed variations
===================
It would be fair to mention also a whole different PEP, Antoine's PEP 460!
My apologies for the omission.
-------------------------------------------------------------------------------
A competing PEP, ``PEP 460 Add binary interpolation and formatting`` [9], also
exists.

.. [9] http://python.org/dev/peps/pep-0460/
-------------------------------------------------------------------------------

Thank you, Victor.