Re: [Python-Dev] PEP 461 Final?

18 Jan 2014

On 01/18/2014 05:48 AM, Nick Coghlan wrote:
...
On 18 Jan 2014 11:52, "Ethan Furman" wrote:
...
I'll admit to being somewhat on the fence about %a.
It seems there are two possibilities with %a:
1) have it be ascii(repr(obj))
2) have it be str(obj).encode('ascii', 'strict')
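
To make the difference between the two options concrete, here is a minimal sketch. The helper names option_1 and option_2 are hypothetical and only label the two expressions above. Note that option 1 as literally written yields str rather than bytes, so a real %a would still need a final ASCII encoding step.

```python
# Hypothetical helpers; the names are illustrative only and are not
# part of any proposal in this thread.
def option_1(obj):
    # ascii() of the repr: never fails, non-ASCII characters are escaped.
    # The result is still a str, not bytes.
    return ascii(repr(obj))


def option_2(obj):
    # Strict ASCII encoding of str(obj): raises UnicodeEncodeError
    # as soon as the text contains a non-ASCII character.
    return str(obj).encode('ascii', 'strict')


print(option_1('abc'))   # "'abc'"
print(option_2('abc'))   # b'abc'

print(option_1('caf\xe9'))   # "'caf\xe9'"
try:
    option_2('caf\xe9')
except UnicodeEncodeError as exc:
    print('option 2 failed:', exc.reason)
```

Option 1 never raises but wraps the value in repr quoting. Option 2 keeps the text as-is but fails on anything outside ASCII.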
This gets very close to crossing the line into implicit encoding of text again. Binary interpolation is being added back
for the specific use case of working with ASCII-compatible segments in binary formats, and it's at best arguable that
supporting %a will help with that use case.
Agreed.
...
However, without it, there may be a greater temptation to inappropriately define __bytes__ just to support binary
interpolation, rather than because a type truly has an appropriate translation directly to bytes.
True.
...
By allowing %a, we avoid that temptation. This is also potentially useful specifically in the case of binary logging
formats and as a quick way to request backslash escaping of non-ASCII characters in text.
Call it +0.5 for allowing %a. I don't expect it to be used heavily, but I think it will head off a fair bit of potential
misuse of __bytes__.
So, if %a is added it would act like:

---------
   b"%a" % some_obj
---------
   tmp = str(some_obj)
   res = b''
   for ch in tmp:
       if ord(ch) < 256:
           res += bytes([ord(ch)])
       else:
           res += unicode_escape(ch)
---------

where 'unicode_escape' would yield something like "\u0440"?
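
For comparison, a rough sketch of the same idea using the existing 'backslashreplace' error handler of the ASCII codec. The function name percent_a_sketch is hypothetical, and this is only one possible reading of the behaviour, not a settled implementation. It assumes the cutoff is 128 rather than 256, since ASCII only covers code points 0 through 127.

```python
def percent_a_sketch(obj):
    # Hypothetical stand-in for the proposed %a behaviour.
    # 'backslashreplace' substitutes \xNN, \uNNNN or \UNNNNNNNN for every
    # character that the ASCII codec cannot encode.
    return str(obj).encode('ascii', 'backslashreplace')


print(percent_a_sketch('abc'))       # b'abc'
print(percent_a_sketch('\u0440'))    # b'\\u0440'
print(percent_a_sketch('caf\xe9'))   # b'caf\\xe9'
```

With this approach characters in the range 128 to 255 are escaped as \xNN instead of being copied through as raw bytes, so the result is always pure ASCII.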

--
~Ethan~