[Python-Dev] PEP 460 reboot

Tue Jan 14 03:55:06 CET 2014

On 2014-01-14 02:25, Terry Reedy wrote:
> On 1/13/2014 4:32 PM, Guido van Rossum wrote:
>
>   > I will doggedly keep posting to this thread rather than creating more
> threads.
>
> Please permit me to doggedly keep pointing you toward the possible
> solution I posted on the tracker last October.
>
>> But formatb() feels absurd to me. PEP 460 has neither a precise
>> specification nor any actual examples, so I can't tell whether the
>
> Two days ago, I reposted byteformat() here on pydev with a precise text
> specification added to the code, and with an expanded test example. I
> have just added another example based on your question below.
>
>> intention is that the format string can *only* contain {...} sequences
>> or whether it can also contain "regular" characters. Translating to
>> formatb(), my question comes down to the legality of the following
>> example:
>>
>>    b'Hello, {}'.formatb(name)  # Where name is some bytes object
>>
>> If this is allowed, it reintroduces the ASCII bias (since the
>> substring 'Hello' is clearly ASCII).
>
> Since byteformat() uses re to find {<format-spec>} replacement fields,
> it only has such ascii bias as re has, which I believe is not much, if
> any. As far as re and byteformat are concerned, everything outside of
> the {...} fields is uninterpreted bytes. As far as bytes.join is
> concerned, both joiner and joined are uninterpreted bytes.
>
>   >>> byteformat(b'\x00{}\x02{}def', (b'\x01', b'abc',))
> b'\x00\x01\x02abcdef'
>
[snip]
Couldn't that suffer from false positives, i.e. binary data that
happens to match? (Rare, yes, but possible.)
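
To make the concern concrete, here is a minimal sketch. It assumes a
much-simplified stand-in for byteformat() (the name byteformat_sketch and
its bare-{} handling are hypothetical; the version on the tracker accepts
format specs and may well behave differently). It reproduces the example
above and then shows a template whose binary prefix happens to contain
the bytes 0x7B 0x7D:

```python
import re

# Hypothetical stand-in for the byteformat() discussed above. It only
# recognises the bare b'{}' field, not {<format-spec>}.
_FIELD = re.compile(rb'\{\}')


def byteformat_sketch(template, args):
    """Replace each b'{}' in template with the next item of args."""
    literals = _FIELD.split(template)
    if len(literals) - 1 != len(args):
        raise ValueError('template has %d fields, got %d arguments'
                         % (len(literals) - 1, len(args)))
    out = [literals[0]]
    for arg, literal in zip(args, literals[1:]):
        out.append(arg)      # substituted bytes are never rescanned
        out.append(literal)  # bytes outside the fields are copied as-is
    return b''.join(out)


# The example from the thread.
result = byteformat_sketch(b'\x00{}\x02{}def', (b'\x01', b'abc'))

# A binary header that happens to contain 0x7B 0x7D is indistinguishable
# from a replacement field once it is part of the template.
header = b'\x7b\x7d\x00'          # the same bytes as b'{}\x00'
collision = byteformat_sketch(header + b'{}', (b'A', b'B'))

# Bytes passed as arguments are safe, because only the template is scanned.
safe = byteformat_sketch(b'{}', (b'{}',))
```

In this sketch the collision can only come from the template, never from
the substituted arguments, so a doubled-brace escape in the template (as
str.format has) would be one way to avoid it.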