[Python-Dev] PEP 461 - Adding % and {} formatting to bytes

Ethan Furman ethan at stoneleaf.us
Tue Jan 14 23:22:18 CET 2014


On 01/14/2014 02:17 PM, Nick Coghlan wrote:
>
> On 15 Jan 2014 07:36, "Ethan Furman" <ethan at stoneleaf.us> wrote:
>>
>> On 01/14/2014 12:57 PM, Antoine Pitrou wrote:
>>>
>>> On Tue, 14 Jan 2014 11:56:25 -0800
>>> Ethan Furman <ethan at stoneleaf.us> wrote:
>>>>
>>>>
>>>> %s, because it is the most general, has the most convoluted resolution:
>>>>
>>>>     - input type is bytes?
>>>>       pass it straight through
>>>
>>>
>>> It should try to get a Py_buffer instead.
>>
>>
>> Meaning any bytes or bytes-subtype will support the Py_buffer protocol, and this should be the first thing we try?
>>
>> Sounds good.
>>
>> For that matter, should the first test be "does this object support Py_buffer" and not worry about it being isinstance(obj, bytes)?
>
> Yep. I actually suggest adjusting the %s handling to:
>
> - interpolate Py_buffer exporters directly
> - interpolate __bytes__ if defined
> - reject anything with an "encode" method
> - otherwise interpolate str(obj).encode("ascii")
>
>>>>     - input type is numeric?
>>>>       use its __xxx__ [1] [2] method and ascii-encode it (strictly)
>>>
>>>
>>> What is the definition of "numeric"?
>>
>>
>> That is a key question.
>
> As suggested above, I would flip the question and explicitly *disallow* implicit encoding of any object with its own
> "encode" method, while allowing everything else.

Um, ints and floats (for example) don't have an .encode method, don't export Py_buffer, and don't have a __bytes__
method... Ah! So they would hit the last case, I see.

The danger I see with that route is that any ol' object could then make it into the byte stream, and considering what
byte streams are for, I think we should set the barrier to entry higher than just relying on a __str__ or __repr__.
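
To make the concern concrete, here is a rough pure-Python sketch of the four-step %s rule quoted above. It is only an illustration, not the PEP's implementation: the names resolve_percent_s and Widget are made up for this example, and memoryview() is used as a Python-level stand-in for the C-level "exports Py_buffer" check.

```python
def resolve_percent_s(obj):
    """Hypothetical helper: return the bytes that %s would interpolate for obj."""
    # 1. Interpolate Py_buffer exporters directly.
    #    memoryview() raises TypeError for objects without buffer support.
    try:
        view = memoryview(obj)
    except TypeError:
        pass
    else:
        with view:
            return view.tobytes()

    # 2. Interpolate __bytes__ if defined (looked up on the type,
    #    as is usual for special methods).
    to_bytes = getattr(type(obj), "__bytes__", None)
    if to_bytes is not None:
        return to_bytes(obj)

    # 3. Reject anything with an "encode" method (str, for instance).
    if hasattr(obj, "encode"):
        raise TypeError("%s does not accept objects with an encode() method")

    # 4. Otherwise fall back to str(obj), strictly ASCII-encoded.
    return str(obj).encode("ascii")


class Widget:
    """Hypothetical class with no buffer support, __bytes__, or encode()."""


print(resolve_percent_s(b"abc"))     # buffer exporter, step 1
print(resolve_percent_s(42))         # int falls through to step 4
print(resolve_percent_s(Widget()))   # so does any ol' object, via default repr
```

Under these assumptions ints and floats do reach the last case, but so does Widget(), whose default "<... Widget object at 0x...>" text lands in the byte stream without complaint.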

--
~Ethan~

