[Python-Dev] PEP 460 reboot

Mon Jan 13 10:13:48 CET 2014

On 13 Jan 2014 17:43, "Ethan Furman" <ethan at stoneleaf.us> wrote:
>
> On 01/12/2014 10:51 PM, Nick Coghlan wrote:
>>
>>
>> I am a strong -1 on the more lenient proposal, as it makes binary
>> interpolation in Python 3 an *unsafe operation* for ASCII incompatible
>> binary formats.
>
>
> No more unsafe than calling .upper() on ASCII incompatible streams.

Right - Guido's proposal is *completely useless* for arbitrary binary data.
You can't trust it.
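
As a minimal illustration of the .upper() hazard (a sketch only; the
UTF-16-LE sample data is an assumption chosen for the example, not something
from this thread), the bytes method rewrites any byte in the ASCII a-z range,
whether or not that byte actually represents a letter:

```python
# U+0161 (s with caron) encodes in UTF-16-LE as the bytes 0x61 0x01.
data = "\u0161".encode("utf-16-le")

# bytes.upper() sees 0x61 (ASCII "a") and turns it into 0x41 (ASCII "A"),
# with no knowledge that the byte is half of a UTF-16 code unit.
mangled = data.upper()

# The result still decodes, but to a different character (U+0141),
# so the corruption is silent.
print(data, mangled, mangled.decode("utf-16-le"))
```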

However, Python 3 has no equivalent binary interpolation feature that *is*
safe for arbitrary binary data, so the lenient version *will* be a bug
magnet if it is the only version of binary interpolation provided.
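
A small sketch of the kind of bug in question, assuming Python 3.5 or later
(where bytes objects support % formatting) and an assumed UTF-16-LE payload:
the number is rendered as ASCII digits regardless of the encoding of the
surrounding data, and nothing raises an error.

```python
# UTF-16-LE is not ASCII compatible: every code unit is two bytes wide.
prefix = "Total: ".encode("utf-16-le")

# ASCII-assuming interpolation renders 42 as the two bytes b"42".
mixed = prefix + b"%d" % 42

# What the data format actually needed: the digits in UTF-16-LE.
correct = prefix + "42".encode("utf-16-le")

# Both decode without error, but only one of them says "Total: 42".
print(repr(mixed.decode("utf-16-le")))
print(repr(correct.decode("utf-16-le")))
```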

If, on the other hand, new formatb and formatb_map methods were included in the
proposal with the current strict PEP 460 semantics, then my objections
would be reduced substantially. In that case, we'd still be providing the
new binary interpolation feature *in addition* to restoring the ASCII
compatible interpolation feature, so the latter would be less of an
attractive nuisance when writing code that needs to handle arbitrary binary
formats and can't assume ASCII compatibility.

With that approach, I'd even support the idea of implicit strict ASCII
encoding of text inputs for the ASCII compatible version.
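
To make the distinction concrete, here is a rough stand-in for a strictly
binary interpolation helper. The function name, the {} placeholder syntax and
the error messages are all invented for illustration; this is not the PEP 460
specification, only a sketch of the "bytes in, bytes out, nothing implicitly
encoded" behaviour described above.

```python
def formatb_sketch(template: bytes, *args) -> bytes:
    """Hypothetical strict interpolation: only bytes-like values allowed."""
    for arg in args:
        # No implicit encoding: str, int, etc. are rejected outright.
        if not isinstance(arg, (bytes, bytearray, memoryview)):
            raise TypeError(
                "expected a bytes-like object, got %s" % type(arg).__name__
            )
    # Illustrative placeholder syntax: each b"{}" marks one insertion point.
    parts = template.split(b"{}")
    if len(parts) - 1 != len(args):
        raise ValueError("placeholder count does not match argument count")
    out = bytearray(parts[0])
    for arg, tail in zip(args, parts[1:]):
        out += bytes(arg)  # copied verbatim, no ASCII assumption
        out += tail
    return bytes(out)


# Arbitrary binary data passes through untouched.
print(formatb_sketch(b"\x00{}\xff", b"\x01\x02"))
```

Under this sketch, passing a str or an int raises TypeError, so the caller has
to choose an encoding explicitly before interpolating.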

>
>
>
>> The existing binary operations that assume ASCII do so *inherently* -
>> they're not input driven, the operation itself assumes ASCII, so if
>> you're working with data that may not be ASCII compatible, you simply
>> don't use them (these are operations like title(), upper(), lower(),
>> the default arguments for split() and strip(), etc).
>
>
> How is this different from not using % interpolation when the byte stream
> is incompatible?  It isn't.

Because I *want to use* the PEP 460 binary interpolation API, but wouldn't
be able to use Guido's more lenient proposal, as it is a bug magnet in the
presence of arbitrary binary data. Provide both APIs and my objections go
away - ASCII interpolation just becomes another way to translate between
structured and text data, while binary interpolation would be a strictly
binary only operation.

>
> And what do you mean by "input driven"?  If the LHS is bytes, the result
> is bytes, no matter what the input is.  This is not the Py2 world where you
> may end up with str or unicode; you always end up with bytes if the LHS is
> bytes.

The LHS may or may not be tainted with assumptions about ASCII
compatibility, which means it effectively *is* tainted with such
assumptions, which means code that needs to handle arbitrary binary data
can't use it and is left without a binary interpolation feature.

That's why *adding* formatb to Guido's more lenient proposal resolves my
objections: it provides the binary interpolation feature I want, and
maintains Python 3's clear distinction between the text domain and the
binary domain.

Cheers,
Nick.

>
> [snip the rest that seems to flow from these misunderstandings]
>
> --
> ~Ethan~