On 01/18/2014 05:21 PM, Neil Schemenauer wrote:
Ethan Furman <ethan@stoneleaf.us> wrote:
So, if %a is added it would act like:
---------
"%a" % some_obj
---------
tmp = str(some_obj)
res = b''
for ch in tmp:
    if ord(ch) < 256:
        res += bytes([ord(ch)])
    else:
        res += unicode_escape(ch)
---------
where 'unicode_escape' would yield something like "\u0440"?
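
A runnable version of the sketch above, for anyone who wants to try it. The name percent_a_sketch is made up for illustration, and unicode_escape is a hypothetical helper filled in here with the standard "unicode_escape" codec. This is only one guess at what that helper was meant to do.

```python
def unicode_escape(ch):
    # Hypothetical helper: assumes the intended output is the backslash
    # escape produced by the standard "unicode_escape" codec.
    return ch.encode("unicode_escape")


def percent_a_sketch(some_obj):
    # Illustrative name only; mirrors the pseudocode step by step.
    tmp = str(some_obj)
    res = b""
    for ch in tmp:
        if ord(ch) < 256:
            # Code points below 256 pass through as a single raw byte.
            res += bytes([ord(ch)])
        else:
            # Everything else becomes an ASCII backslash escape.
            res += unicode_escape(ch)
    return res
```

Note that under this reading, code points 128 through 255 come out as raw non-ASCII bytes, and the result is built from str() rather than from a repr-style form.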
My patch on the tracker already implements %a; it's simple.
Before one implements a patch it is good to know the specifications.
Just call PyObject_ASCII() (same as ascii()), then call PyUnicode_AsLatin1String(s) to convert the result to bytes and stick it in. PyObject_ASCII does not return non-ASCII characters, so no encoding error is possible. We could call _PyUnicode_AsASCIIString(s, "strict") instead if we are worried about non-ASCII characters coming out of PyObject_ASCII.
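
At the Python level, the approach described above might look roughly like the following. This is a sketch of the idea, not the patch itself; the function name format_percent_a is an assumption made for illustration.

```python
def format_percent_a(obj):
    # Illustrative name only. ascii() is the Python-level counterpart of
    # PyObject_ASCII(): it returns the repr with every non-ASCII character
    # replaced by a backslash escape.
    text = ascii(obj)
    # The text is pure ASCII at this point, so a strict ASCII encode
    # (or a Latin-1 encode) yields the same bytes and cannot fail.
    return text.encode("ascii", "strict")
```

Because this goes through ascii() rather than str(), a string argument keeps its quotes, and characters in the range 128 through 255 are escaped instead of being passed through as raw bytes.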
I appreciate that this is the behavior you want, but I'm not sure it's the behavior Nick was describing.

--
~Ethan~