[Python-Dev] PEP 460 reboot

Tue Jan 14 17:30:01 CET 2014

On Mon, Jan 13, 2014 at 5:14 PM, Guido van Rossum <guido at python.org> wrote:

> On Mon, Jan 13, 2014 at 2:05 PM, Brett Cannon <brett at python.org> wrote:
> > I have been going on the assumption that bytes.format() would change
> > what '{}' meant for itself and would only interpolate bytes. That
> > convenient between Python 2 and 3 since it represents what we want it
> > to (str and bytes under the hood, respectively), so it just falls
> > through. We could also add a 'b' conversion for bytes() explicitly so
> > as to help people not accidentally mix up things in bytes.format() and
> > str.format(). But I was not suggesting adding a specific format spec
> > for bytes but instead making bytes.format() just do the
> > .encode('ascii') automatically to help with compatibility when a
> > format spec was present. If people want fancy formatting for bytes
> > they can always do it themselves before calling bytes.format().
>
> This seems hastily written (e.g. verb missing :-), and I'm not clear
> on what you are (or were) actually proposing. When exactly would
> bytes.format() need .encode('ascii')?
>
> I would be happy to wait a few hours or days for you to write it up
> clearly, rather than responding in a hurry.

Sorry about that. Busy day at work + trying to stay on top of this entire
conversation was a bit tough. Let me try to lay out what I'm suggesting for
bytes.format() in terms of how it changes
http://docs.python.org/3/library/string.html#format-string-syntax for bytes.

1. New conversion operator of 'b' that operates as PEP 460 specifies (i.e.
tries to get a buffer, else calls __bytes__). The default conversion
changes from 's' to 'b'.
2. Use of a format spec adds an extra step of calling
str.encode('ascii', 'strict') on the result returned from calling
__format__().

That's it. So point 1 means that the following would work in Python 3.5::

  b'Hello, {}, how are you?'.format(b'Guido')
  b'Hello, {!b}, how are you?'.format(b'Guido')

It would produce an error if you used a text argument for 'Guido' since str
doesn't define __bytes__ or a buffer. That gives the EIBTI group their
bytes.format() where nothing magical happens.
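
As a rough sketch of what point 1 implies, the 'b' conversion can be
modelled as an ordinary Python 3 function. The name convert_b and the
Name class are hypothetical, and the buffer-first, __bytes__-second
order is only a reading of the description above, not a reference
implementation:

```python
def convert_b(obj):
    """Hypothetical stand-in for the proposed 'b' conversion."""
    try:
        # Buffer protocol first: bytes, bytearray, memoryview, array, ...
        return bytes(memoryview(obj))
    except TypeError:
        pass
    # Fall back to __bytes__, looked up on the type like other special methods.
    to_bytes = getattr(type(obj), '__bytes__', None)
    if to_bytes is None:
        raise TypeError("%r object supports neither the buffer protocol "
                        "nor __bytes__" % type(obj).__name__)
    return to_bytes(obj)


class Name:
    """Example type that opts in through __bytes__."""
    def __bytes__(self):
        return b'Guido'


print(b'Hello, ' + convert_b(b'Guido') + b', how are you?')
print(b'Hello, ' + convert_b(Name()) + b', how are you?')
```

Passing the text string 'Guido' or the integer 10 to convert_b raises
TypeError in this sketch, which is the "nothing magical happens"
behaviour described above.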

For point 2, let's say you have the following in Python 2::

  'I have {} bottles of beer on the wall'.format(10)

Under my proposal, how would you change it to get the same result in Python
2 and 3?::

  b'I have {:d} bottles of beer on the wall'.format(10)

In Python 2 you're just being more explicit about the format, otherwise
it's the same semantics as today. In Python 3, though, this would translate
into (under the hood)::

  b'I have {} bottles of beer on the wall'.format(
      format(10, 'd').encode('ascii', 'strict'))

This leads to the same bytes value in Python 2 (since it's just a string)
and in Python 3 (as everything accepted by bytes.format() is either bytes
already or converted to ASCII bytes by encoding). While Python 2 users
would need to make sure they used a format spec to get the same result in
both Python 2 and 3 for ASCII bytes, it's a minor change which also makes
the format more explicit so it's not an inherently bad thing. And for those
that don't want to utilize the automatic ASCII encoding they can just not
use a format spec in the format string and just pass in bytes directly
(i.e. call __format__() themselves and then call str.encode() on their
own). So PBP people get to have a simple way to use bytes.format() in
Python 2 and 3 when dealing with things that can be represented as ASCII
(just as the bytes methods allow for currently).
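
Since bytes.format() does not exist in Python 3 today, the two rules
can only be emulated. The following is a minimal sketch under stated
assumptions: bformat is a hypothetical helper, it handles only
auto-numbered positional fields, it omits the __bytes__ fallback for
brevity, and it lets a format spec take precedence when both a spec
and a conversion are given (the proposal does not say what should
happen in that case):

```python
import string


def bformat(template, *args):
    """Hypothetical emulation of the proposed bytes.format()."""
    out = []
    index = 0
    # latin-1 maps every byte to one code point, so the template round-trips.
    text = template.decode('latin-1')
    for literal, field, spec, conversion in string.Formatter().parse(text):
        out.append(literal.encode('latin-1'))
        if field is None:
            continue  # literal text with no replacement field after it
        if field != '':
            raise ValueError('this sketch only handles auto-numbered fields')
        value = args[index]
        index += 1
        if spec:
            # Point 2: format as text, then encode strictly as ASCII.
            out.append(format(value, spec).encode('ascii', 'strict'))
        elif conversion in (None, 'b'):
            # Point 1: no format spec, so only bytes-like objects pass.
            out.append(bytes(memoryview(value)))
        else:
            raise ValueError('unsupported conversion %r' % conversion)
    return b''.join(out)


print(bformat(b'I have {:d} bottles of beer on the wall', 10))
print(bformat(b'Hello, {!b}, how are you?', b'Guido'))
```

In this sketch, a text argument without a format spec raises
TypeError, and a format spec whose result contains non-ASCII
characters raises UnicodeEncodeError because of the 'strict' error
handler.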

I think this covers your desire to have numbers and anything else that can
be represented as ASCII be supported for easy porting while covering my
desire that any automatic encoding is clearly explicit in the format string
and in no way special-cased for only some types (the introduction of a 'c'
converter from PEP 460 is also fine with me).

I'm not sure how you would want to translate this proposal to the %
operator. It has been quite a while since I last seriously used it, so
I don't think I'm in a good position to propose a shift for it.