On 03/26/2014 03:10 AM, Victor Stinner wrote:
2014-03-25 23:37 GMT+01:00 Ethan Furman:
``%a`` will call ``ascii()`` on the interpolated value.
I'm not sure that I understood correctly: is the "%a" format supported? The result of ascii() is a Unicode string. Does it mean that ("%a" % obj) should give the same result as ascii(obj).encode('ascii', 'strict')?
Changed to:

-------------------------------------------------------------------------------
``%a`` will give the equivalent of
``repr(some_obj).encode('ascii', 'backslashreplace')`` on the interpolated
value.  Use cases include developing a new protocol and writing landmarks
into the stream; debugging data going into an existing protocol to see if
the problem is the protocol itself or bad data; a fall-back for a
serialization format; or any situation where defining ``__bytes__`` would
not be appropriate but a readable/informative representation is needed [8].
-------------------------------------------------------------------------------
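A minimal sketch of that equivalence, assuming the ``%a`` behaviour for bytes as it shipped in Python 3.5 and later. ``Landmark`` and ``percent_a`` are hypothetical names used only for illustration; they are not part of the PEP.

```python
class Landmark:
    """Hypothetical object with a readable repr but no __bytes__ method."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "<Landmark %s>" % self.name


def percent_a(obj):
    # The definition quoted above, spelled out by hand.
    return repr(obj).encode('ascii', 'backslashreplace')


mark = Landmark('caf\u00e9')          # name contains a non-ASCII character
via_format = b'%a' % mark             # bytes interpolation with %a
via_definition = percent_a(mark)      # hand-written equivalent

print(via_format)
print(via_format == via_definition)
```

The non-ASCII character in the repr is escaped rather than raising an error, which is what makes ``%a`` usable as a debugging aid on arbitrary objects.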
Would it be possible to add a table or list to summarize supported format characters? I found:
- single byte: %c
- integer: %d, %u, %i, %o, %x, %X, %f, %g, "etc." (can you please complete "etc." ?)
- bytes and __bytes__ method: %s
- ascii(): %a
Changed to:

-------------------------------------------------------------------------------
%-interpolation
---------------

All the numeric formatting codes (``d``, ``i``, ``o``, ``u``, ``x``, ``X``,
``e``, ``E``, ``f``, ``F``, ``g``, ``G``, and any that are subsequently added
to Python 3) will be supported, and will work as they do for str, including
the padding, justification and other related modifiers (currently ``#``,
``0``, ``-``, `` `` (space), and ``+`` (plus any added to Python 3)).  The
only non-numeric codes allowed are ``c``, ``s``, and ``a``.

For the numeric codes, the only difference between ``str`` and ``bytes``
(or ``bytearray``) interpolation is that the results from these codes will
be ASCII-encoded text, not unicode.  In other words, for any numeric
formatting code `%x`::
-------------------------------------------------------------------------------
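The claim about numeric codes can be checked with a short sketch, assuming an interpreter where bytes interpolation is available (Python 3.5 or later). The helper ``check`` and the sample list are illustrative only and are not taken from the PEP text.

```python
def check(fmt, value):
    # Compare bytes interpolation with str interpolation encoded to ASCII.
    as_bytes = fmt.encode('ascii') % value
    as_text = (fmt % value).encode('ascii')
    return as_bytes, as_text


samples = [
    ('%d', 42),
    ('%05d', 42),         # zero padding
    ('%+d', 42),          # explicit sign
    ('%-6x|', 255),       # left justification
    ('%#X', 255),         # alternate form
    ('%#o', 8),
    ('% .3f', 3.14159),   # space flag with precision
    ('%e', 12345.678),
    ('%G', 0.000001234),
]

results = [check(fmt, value) for fmt, value in samples]
for (fmt, value), (as_bytes, as_text) in zip(samples, results):
    print(fmt, value, as_bytes, as_bytes == as_text)
```

Each pair should compare equal: the bytes result is the str result encoded as ASCII, modifiers included.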
I don't understand the purpose of this sentence. Does it mean that %a must not be used? IMO this sentence can be removed.
The sentence about %a being for debugging has been removed.
Non-ASCII values will be encoded to either ``\xnn`` or ``\unnnn`` representation.
Unicode is larger than that! print(ascii(chr(0x10ffff))) => '\U0010ffff'
Removed. With the explicit reference to the 'backslashreplace' error handler any who want to know what it might look like can refer to that.
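For readers who do want to see it, here is a small sketch of what the 'backslashreplace' error handler produces for code points of different sizes. The sample characters and labels are arbitrary choices for illustration.

```python
samples = {
    'latin-1 range': '\u00e9',      # fits in two hex digits
    'BMP': '\u20ac',                # needs four hex digits
    'astral plane': '\U0001f600',   # outside the BMP, needs eight hex digits
}

escaped = {
    label: text.encode('ascii', 'backslashreplace')
    for label, text in samples.items()
}

for label, data in escaped.items():
    print(label, data)

# Victor's example: the highest code point also takes the eight-digit form.
highest = chr(0x10ffff).encode('ascii', 'backslashreplace')
print(highest)
```

So three escape forms are possible (``\xnn``, ``\unnnn`` and ``\Unnnnnnnn``), which is why the two-form sentence was dropped from the draft.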
.. note::

    If a ``str`` is passed into ``%a``, it will be surrounded by quotes.
And:
- bytes gets a "b" prefix and surrounded by quotes as well (b'...')
- the quote ' is escaped as \' if the string contains quotes ' and "
Shouldn't be an issue now with the new definition which no longer references the ascii() function.
Can you also please add examples for %a?
Examples::

    >>> b'%a' % 3.14
    b'3.14'

    >>> b'%a' % b'abc'
    b"b'abc'"

    >>> b'%a' % 'def'
    b"'def'"

-------------------------------------------------------------------------------
Proposed variations
===================
It would be fair to mention also a whole different PEP, Antoine's PEP 460!
My apologies for the omission.

-------------------------------------------------------------------------------
A competing PEP, ``PEP 460 Add binary interpolation and formatting`` [9],
also exists.

.. [9] http://python.org/dev/peps/pep-0460/
-------------------------------------------------------------------------------

Thank you, Victor.