[Python-ideas] Binary f-strings

Chris Angelico rosuav at gmail.com
Mon Sep 28 05:03:32 CEST 2015


On Mon, Sep 28, 2015 at 12:41 PM, Nathaniel Smith <njs at pobox.com> wrote:
> Naively, I'd expect that since f-strings and .format share the same
> infrastructure, fb-strings should work the same way as bytes.format --
> and in particular, either both should be supported or neither. Since
> bytes.format apparently got rejected during the PEP 460/PEP 461
> discussions:
>     https://bugs.python.org/issue3982#msg224023
> I guess you'd need to dig up those earlier discussions and see what
> the issues were?

The biggest issues are summarized in PEP 461:

https://www.python.org/dev/peps/pep-0461/#proposed-variations

Since the __format__ machinery is all based around text strings,
there'll need to be some (explicit or implicit) encode step. Hence
this thread.
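
To make the encode step concrete, here is a minimal sketch, assuming Python 3. The names `rendered`, `has_bytes_format`, and `encoded` are only illustrative, and ASCII is picked just for the example:

```python
# format() goes through __format__, which always returns text (str).
rendered = format(1961, "06d")
print(type(rendered).__name__, rendered)

# bytes has no .format() method, so getting bytes out of the
# __format__ machinery needs an explicit encode step afterwards.
has_bytes_format = hasattr(bytes, "format")
encoded = rendered.encode("ascii")
print(has_bytes_format, encoded)
```

Any bytes-producing variant would have to pick that encoding somewhere, whether the user spells it out or the language does it implicitly.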

How bad would it be to simply say "there are no bf strings"? As Steven
says, you can simply use a normal f''.encode() operation, with no
confusion. Otherwise, there'll be these "format-like" operations that
can do things that format() can't do... and then there'd be edge
cases, too, like a string with a b-prefix that contains non-ASCII
characters in it:

>>> восток = 1961
>>> apollo = 1969
>>> print(f"It took {apollo-восток} years to get from orbit to the moon.")
It took 8 years to get from orbit to the moon.
>>> print(b"It took {apollo-восток} years to get from orbit to the moon.")
  File "<stdin>", line 1
SyntaxError: bytes can only contain ASCII literal characters.

If that were a binary f-string, those Cyrillic characters should still
be legal (as they name an identifier, rather than ending up in the
resulting bytes). Would it confuse (a) humans, or (b) tools, to have
these "texty bits" inside a byte string?

In any case, bf strings can be added later, but once they're added,
their semantics would be locked in. I'd be inclined to leave them out
for 3.6 and see what people say. A bit of real-world usage of
f-strings might show a clear front-runner in terms of expectations
(UTF-8, ASCII, or something else).
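
Why the choice matters can be seen in a small sketch (again assuming 3.6 or later; `name`, `text`, and the other variables are made up for the example). The same formatted text gives different outcomes depending on the encoding:

```python
name = "Гагарин"
text = f"First in orbit: {name}"

# UTF-8 can represent any text, at more than one byte per non-ASCII character.
utf8_bytes = text.encode("utf-8")

# ASCII refuses anything outside its range.
try:
    ascii_bytes = text.encode("ascii")
except UnicodeEncodeError:
    ascii_bytes = None

print(len(text), len(utf8_bytes), ascii_bytes)
```

Whichever default a bf string used, one of these behaviours would be baked in.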

ChrisA

