[Python-Dev] PEP 460 reboot

Nick Coghlan ncoghlan at gmail.com
Mon Jan 13 18:12:51 CET 2014


On 14 January 2014 01:54, Ethan Furman <ethan at stoneleaf.us> wrote:
> On 01/13/2014 01:13 AM, Nick Coghlan wrote:
>
>> On 13 Jan 2014 17:43, "Ethan Furman" wrote:
>>>
>>> On 01/12/2014 10:51 PM, Nick Coghlan wrote:
>>>>
>>>>
>>>> I am a strong -1 on the more lenient proposal, as it makes binary
>>>> interpolation in Python 3 an *unsafe operation* for ASCII incompatible
>>>> binary formats.
>>>
>>>
>>> No more unsafe than calling .upper() on ASCII incompatible streams.
>>
>>
>> Right - Guido's proposal is *completely useless* for arbitrary binary
>> data. You can't trust it.
>
>
> Forgive me for being dense, but I don't understand your objection.  With
> Guido's proposal, '%s' % bytes_data, bytes_data is passed through unchanged.
> Did you mean something else by "binary data"?

I mean it will work, but it will mean you've introduced an implicit
assumption of ASCII compatibility into the structure of your program,
with no straightforward way of removing it (you would have to rewrite
your code to not rely on interpolation). This becomes most obvious
when the formatting string is passed as a variable, rather than being
provided as a literal, or when you don't know the type of the *value*
provided and some types may involve implicit encoding operations (I
don't think Guido proposed that, but others have). That's the kind of
data driven uncertainty I don't like in Python 2, and I find its
categorical elimination to be one of the best features of Python 3 -
there are certain kinds of data manipulation bugs that simply *can't
exist* because the types don't work that way any more.
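
A minimal sketch of the kind of silent corruption described above. It
assumes Python 3.5 or later, where bytes % formatting is available;
build_record and the UTF-16 label are hypothetical names chosen only
for illustration, not part of any proposal in this thread.

```python
# Assumes Python 3.5+, where bytes objects support %-interpolation.
# build_record is a hypothetical helper used only for illustration.

def build_record(label: bytes, size: int) -> bytes:
    # %d always renders the number as ASCII digits, whatever
    # encoding the surrounding binary data happens to use.
    return label + (b"%d" % size)

# The label arrives as data in an ASCII incompatible encoding.
utf16_label = "size=".encode("utf-16-le")
record = build_record(utf16_label, 42)

# No exception is raised, but the two ASCII digit bytes are read
# back as a single UTF-16 code unit instead of the text "42".
decoded = record.decode("utf-16-le")
print(ascii(decoded))
```

Nothing fails at interpolation time; the mismatch only shows up when
the record is decoded by something that expects UTF-16 throughout.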

However, that's also why *adding* formatb/formatb_map to the proposal
(with Antoine's stricter semantics) would resolve my concerns - you
can ensure you don't introduce an implicit assumption of ASCII
compatibility by using those for interpolation rather than the ASCII
compatible __mod__/format/format_map that the bytes type will share
with the str type.
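
As a rough illustration of what stricter semantics could look like,
here is a hypothetical helper written in plain Python. It is not the
proposed formatb API (strict_fill and its behaviour are assumptions
made for this sketch); it only shows the idea of accepting bytes-like
values and refusing anything that would need an implicit encoding.

```python
# Hypothetical sketch only: strict_fill is not a real or proposed API.

def strict_fill(template: bytes, *values) -> bytes:
    """Replace each b"%s" in template with a bytes-like value."""
    pieces = template.split(b"%s")
    if len(pieces) - 1 != len(values):
        raise ValueError("placeholder count does not match value count")
    out = bytearray(pieces[0])
    for value, tail in zip(values, pieces[1:]):
        # memoryview() raises TypeError for str, int and other objects
        # that do not support the buffer protocol, so nothing is ever
        # encoded implicitly.
        out += memoryview(value)
        out += tail
    return bytes(out)

framed = strict_fill(b"<%s|%s>", b"\x00\x01", bytearray(b"\xff"))
```

With this shape, passing an int or a str fails immediately with
TypeError rather than being rendered as ASCII text.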

The combination of the two is completely in keeping with the Python 3
text model - we would offer text interpolation, hybrid ASCII
compatible interpolation *and* pure binary interpolation. Offering
only the first two would mean relegating the pure binary domain to a
lower status again, since assuming ASCII compatibility would grant you
access to an interpolation API, so people would be inclined to use it
even when doing so opens the door to data corruption bugs.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

