[Python-Dev] PEP 461 - Adding % and {} formatting to bytes

Eric V. Smith eric at trueblade.com
Wed Jan 15 16:52:27 CET 2014


On 1/15/2014 9:45 AM, Brett Cannon wrote:

> That's too vague; % interpolation does not support other format
> operators in the same way as str.format() does. % interpolation has
> specific code to support %d, etc. But str.format() gets supported for
> {:d} not from special code but because e.g. int.__format__('d') works.
> So you can't say "bytes.format() supports {:d} just like %d works with
> string interpolation" since the mechanisms are fundamentally different.
> 
> This is why I have argued that if you specify it as "if there is a
> format spec specified, then the return value from calling __format__()
> will have str.encode('ascii', 'strict') called on it" you get the
> support for the various number-specific format specs for free. It also
> means if you pass in a string that you just want the strict ASCII bytes
> of then you can get it with {:s}.
> 
> I also think that a 'b' conversion should be added to bytes.format(). This
> doesn't have the same issue as %b if you make {} implicitly mean {!b} in
> Python 3.5 as {} will mean what is the most accurate for bytes.format()
> in either version. It also allows for explicit support where you know
> you only want a byte and allows {!s} to mean you only want a string (and
> thus throw an error otherwise).
> 
> And all of this means that much like %s only taking bytes, the only way
> for bytes.format() to accept a non-byte argument is for some format spec
> to be specified to trigger the .encode('ascii', 'strict') call.

Agreed. With %-formatting, you can start with the format string and
then decide what you want to do with the passed-in objects. But with
.format, it's the other way around: you have to look at the passed-in
objects being formatted, and then decide what the format specifier means
to that type.
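
As a small illustration of that difference (a sketch added for clarity; the
variable names are made up for this example): with %-formatting the format
code decides how the argument is treated, while with .format the spec is
handed to the argument's own __format__, so the same spec means different
things to different types.

```python
import datetime

# %-formatting: the format code drives the conversion.
# %d converts the float to an integer before rendering it.
percent_result = '%d' % 3.7

# str.format(): the spec after the colon is passed to the object's __format__.
# int understands 'x' as hexadecimal.
int_result = '{:x}'.format(255)

# date understands its spec as a strftime pattern; int would reject this spec.
date_result = '{:%Y-%m-%d}'.format(datetime.date(2014, 1, 15))

try:
    '{:%Y-%m-%d}'.format(3)
    int_accepts_date_spec = True
except ValueError:
    int_accepts_date_spec = False
```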

So, for .format, you could say "hey, that object's an int, and I happen
to know how to format ints, without calling its __format__". Or you
could even call its __format__ because you know that it will only be
ASCII. But to take this approach, you're going to have to hard-code the
types. And subclasses are probably out, since there you don't know what
the subclass's __format__ will return. It could be non-ASCII.

>>> class Int(int):
...   def __format__(self, fmt):
...     return u'foo'
...
>>> '{}'.format(Int(3))
'foo'

So basically I think we'll have to hard-code the types that .format()
will support, and either never call __format__, or only call __format__
if we know that it's an exact type whose __format__ will return strict
ASCII.
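
A minimal sketch of what that hard-coding might look like, assuming a
hypothetical helper format_field_as_bytes() and an assumed whitelist of int
and float (neither is part of any actual proposal here). The check is on the
exact type, so a subclass with its own __format__ is rejected rather than
trusted.

```python
# Hypothetical whitelist: exact types whose built-in __format__ is assumed
# (for this sketch) to produce ASCII-only text.
_ASCII_SAFE_TYPES = (int, float)


def format_field_as_bytes(value, spec=''):
    """Format a single replacement field for a bytes result (illustrative only)."""
    if type(value) is bytes:
        if spec:
            raise ValueError('format specs are not supported for bytes in this sketch')
        return value
    # Exact type match on purpose: isinstance() would let subclasses through.
    if type(value) in _ASCII_SAFE_TYPES:
        return format(value, spec).encode('ascii', 'strict')
    raise TypeError('unsupported type for bytes formatting: %s'
                    % type(value).__name__)


class Int(int):
    """Subclass whose __format__ cannot be trusted to return ASCII."""
    def __format__(self, fmt):
        return 'foo'


try:
    format_field_as_bytes(Int(3))
    subclass_rejected = False
except TypeError:
    subclass_rejected = True
```

The cost of this approach is that user-defined types cannot participate at
all, which is the trade-off against the alternative below.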

Either that, or we're back to encoding the result of __format__ and
accepting that sometimes it might throw errors, depending on the values
being passed into format().
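
To show how the error would depend on the value rather than the type, here
is a rough sketch. The function encode_formatted() and the Label class are
invented for this example and only approximate the "encode the result of
__format__" idea.

```python
def encode_formatted(value, spec=''):
    """Call the object's __format__, then strictly encode the result as ASCII."""
    return format(value, spec).encode('ascii', 'strict')


class Label:
    """Hypothetical user type whose __format__ simply returns its text."""
    def __init__(self, text):
        self.text = text

    def __format__(self, spec):
        return self.text


# Same type, different values: one succeeds, the other raises.
ascii_result = encode_formatted(Label('foo'))

try:
    encode_formatted(Label('caf\u00e9'))
    non_ascii_failed = False
except UnicodeEncodeError:
    non_ascii_failed = True
```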

Eric.


