[Python-Dev] PEP 460 reboot

Daniel Holth dholth at gmail.com
Mon Jan 13 21:07:47 CET 2014


I see it now. b"foo%sbar" % b'baz' should also expand to b"foob'foo'bar"

Instead of "%b" could "%j" mean "I should have used + or join() here
but was too lazy" and work on str too?

On Mon, Jan 13, 2014 at 2:51 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 1/13/2014 1:40 PM, Brett Cannon wrote:
>
>> > So bytes formatting really needn't (and shouldn't, IMO) mirror str
>> > formatting.
>
>
> This was my presumption in writing byteformat().
>
>
>> I think one of the things about Guido's proposal that bugs me is that it
>> breaks the mental model of the .format() method from str in terms of how
>> the mini-language works. For str.format() you have the conversion and
>> the format spec (e.g. "{!r}" and "{:d}", respectively). You apply the
>> conversion by calling the appropriate built-in, e.g. 'r' calls repr().
>> The format spec semantically gets passed with the object to format()
>> which calls the object's __format__() method: ``format(number, 'd')``.
>>
>> Now Guido's suggestion has two parts that affect the mini-language for
>> .format(). One is that for bytes.format() the default conversion is
>> bytes() instead of str(), which is fine (probably want to add 'b' as a
>> conversion value as well to be consistent). But the other bit is that
>> the format spec goes from semantically meaning ``format(thing,
>> format_spec)`` to ``format(thing, format_spec).encode('ascii',
>> 'strict')`` for at least numbers. That implicitness bugs me as I have
>> always thought of format specs just leading to a call to format(). I
>> think I can live with it, though, as long as it is **consistently**
>> applied across the board for bytes.format(); every use of a format spec
>> leads to calling ``format(thing, format_spec).encode('ascii',
>> 'strict')`` no matter what type 'thing' would be and it is clearly
>> documented that this is done to ease porting and handle the common case
>> then I can live with it.
>
>
> This is how my byteformat function works, except that when no format_spec is
> given, byte and bytearrary objects are left unchanged rather than being
> decoded and encoded again.
>
>
>> This even gives people in-place ASCII encoding for strings by always
>> using '{:s}' with text which they can do when they port their code to
>> run under both Python 2 and 3. So you should be able to do
>> ``b'Content-Type: {:s}'.format('image/jpeg')`` and have it give ASCII.
>> If you want more explicit encoding to latin-1 then you need to do it
>> explicitly and not rely on the mini-language to do tricks for you.
>>
>> IOW I want to treat the format mini-language as a language and thus not
>> have any special-casing or massive shifts in meaning between
>> str.format() and bytes.format() so my mental model doesn't have to
>> contort based on whether it's str or bytes. My preference is not have
>> any, but if Guido is going say PBP here then I want absolute consistency
>> across the board in how bytes.format() tweaks things.
>>
>> As for %s for the % operator calling ascii(), I think that will be a
>> porting nightmare of finding out why your bytes suddenly stopped being
>> formatted properly and then having to crawl through all of your code for
>> that one use of %s which is getting bytes in. By raising a TypeError you
>> will very easily detect where your screw-up occurred thanks to the
>> traceback; do so otherwise feels too much like implicit type conversion
>> and ask any JavaScript developer how that can be a bad thing.
>
>
> I personally would not add 'bytes % whatever'.
>
> --
> Terry Jan Reedy
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/dholth%40gmail.com


More information about the Python-Dev mailing list