[Python-Dev] PEP 460 reboot

Ethan Furman ethan at stoneleaf.us
Mon Jan 13 03:16:17 CET 2014


On 01/12/2014 06:07 PM, Daniel Holth wrote:
> On Sun, Jan 12, 2014 at 8:27 PM, Ethan Furman <ethan at stoneleaf.us> wrote:
>> On 01/12/2014 04:47 PM, Guido van Rossum wrote:
>>>
>>>
>>> %s seems the trickiest: I think with a bytes argument it should just
>>> insert those bytes (and the padding modifiers should work too), and
>>> for other types it should probably work like %a, so that it works as
>>> expected for numeric values, and with a string argument it will return
>>> the ascii()-variant of its repr(). Examples:
>>>
>>> b'%s' % 42 == b'42'
>>> b'%s' % 'x' == b"'x'" (i.e. the three-byte string containing an 'x'
>>> enclosed in single quotes)
>>
>> I'm not sure about the quotes.  Would anyone ever actually want those in the
>> byte stream?
>
> Is there a formatting character that means "anything except a unicode
> string" to prevent accidentally interpolating a Unicode string into a
> bytes string without [a sane] encoding?

In reference to a byte stream, if you do:

--> b'%s' % 'some text'.encode('cp1241')

it's really just bytes into bytes.

If you do:

--> b'%s' % 'some text'

then the encoding is ASCII with strict error checking.  So if the text is not representable as clean ASCII, either
encode it manually or be prepared for it to blow up with a UnicodeEncodeError.
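
As a rough sketch of the two cases above (this only emulates the behaviour being proposed in this thread, it is not
what any particular Python release does; the helper name interpolate_s is hypothetical and made up for illustration):

```python
def interpolate_s(value):
    """Hypothetical helper: what the proposed b'%s' would insert for one argument."""
    if isinstance(value, (bytes, bytearray)):
        # bytes into bytes: inserted unchanged
        return bytes(value)
    if isinstance(value, str):
        # text: encoded as ASCII with strict error checking
        return value.encode('ascii', 'strict')
    # other types are outside the scope of this sketch
    raise TypeError('unsupported type: %s' % type(value).__name__)


print(interpolate_s(b'some text'))   # already bytes, passed through
print(interpolate_s('some text'))    # clean ASCII, encodes fine

try:
    interpolate_s('caf\xe9')         # not representable as ASCII
except UnicodeEncodeError as exc:
    print('blew up:', exc.reason)
```

Anything that is not clean ASCII has to be encoded by hand first, e.g. interpolate_s('caf\xe9'.encode('latin-1')).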

--
~Ethan~

