PEP-498: Literal String Formatting
Following a long discussion on python-ideas, I've posted my draft of PEP-498. It describes the "f-string" approach that was the subject of the "Briefer string format" thread. I'm open to a better title than "Literal String Formatting". I need to add some text to the discussion section, but I think it's in reasonable shape. I have a fully working implementation that I'll get around to posting somewhere this weekend.
>>> def how_awesome(): return 'very'
...
>>> f'f-strings are {how_awesome()} awesome!'
'f-strings are very awesome!'
I'm open to any suggestions to improve the PEP. Thanks for your feedback. -- Eric.
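P.S. As with str.format, a format specifier can follow the expression after a colon. A small sketch of what that looks like (output shown assuming the draft's str.format-style semantics):

    >>> width = 10
    >>> f'result: {width:#06x}'
    'result: 0x000a'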
On 8 August 2015 at 11:39, Eric V. Smith <eric@trueblade.com> wrote:
Following a long discussion on python-ideas, I've posted my draft of PEP-498. It describes the "f-string" approach that was the subject of the "Briefer string format" thread. I'm open to a better title than "Literal String Formatting".
Thank you for your work on this - it's a very cool concept!

I've also now written and posted an initial draft of PEP 500, based directly on PEP 498, which formalises the "__interpolate__" builtin idea I raised in those threads, along with a PEP 292 based syntax proposal that aims to be as simple as possible for the simple case of interpolating existing variables, while still allowing the use of braces to permit embedding of arbitrary expressions and formatting directives.

It turned out this approach provided an unanticipated benefit that I only discovered while writing the PEP: by defining a separate "__interpolateb__" builtin, it's straightforward to define binary interpolation in terms of bytes.__mod__, while still defining text interpolation in terms of str.format.

The previously-redundant-in-Python-3 'u' prefix also finds new life as a way of always requesting the default string interpolation, even if __interpolate__ has been overridden in the current namespace to mean something else (like i18n string translation).

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
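P.S. For anyone who wants a concrete picture before reading the draft, here is a rough sketch of the kind of desugaring involved. The signature and default implementation below are simplified and hypothetical - the draft PEP is the authoritative version:

    # Hypothetical, simplified desugaring: the compiler would turn
    #     i'Hello ${user}, you have ${n} messages'
    # into something along the lines of
    #     __interpolate__(('Hello ', ', you have ', ' messages'), (user, n))
    # with a default text implementation roughly equivalent to:
    def __interpolate__(literal_parts, values):
        # Interleave the literal segments with the already-evaluated
        # values, formatting each value the way str.format would.
        result = [literal_parts[0]]
        for value, literal in zip(values, literal_parts[1:]):
            result.append(format(value))
            result.append(literal)
        return ''.join(result)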
Please do not change the meaning of the vestigial U''. It was re-added to the language to fix a problem; rebinding it to another meaning introduces new problems. We have plenty of other letters in the alphabet to use. On 8/8/2015 05:34, Nick Coghlan wrote:
[...]
On 9 August 2015 at 00:05, Alexander Walters <tritium-list@sdamon.com> wrote:
Please do not change the meaning of the vestigial U''. It was re-added to the language to fix a problem; rebinding it to another meaning introduces new problems. We have plenty of other letters in the alphabet to use.
It's actually being used in the same sense we already use it - I'm just adding a new compile-time use case where the distinction matters again, which we haven't previously had in Python 3. (The usage in this PEP is fairly closely analogous to WSGI's distinction between native strings, text strings and binary strings, which matters for hybrid Python 2/3 code, but not for pure Python 3 code.)

It would certainly be *possible* to use a different character for that aspect of the PEP, but it would be additional work without any obvious gain.

Cheers, Nick.

P.S. I hop on the plane for the US in a few hours, so I'll be aiming to be bad at responding to emails until the 17th or so. We'll see how well I stick to that plan :)

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
... It's adding meaning to something that was intentionally meaningless. Not using u'' has the obvious, immediate benefit of not caring what u'' means in Python 3, so one can continue to write polyglot code. Since you are adding new semantics to Python 3, use a different letter so that it just breaks in Python 2, instead of having different meanings between versions. Python 2 is still the dominant Python. On 8/8/2015 11:07, Nick Coghlan wrote:
[...]
Wait a second: the PEP itself does not use the vestigial u''... it uses i''. Where did u'' come from? On 8/8/2015 11:07, Nick Coghlan wrote:
[...]
On 9 August 2015 at 01:16, Alexander Walters <tritium-list@sdamon.com> wrote:
Wait a second: the PEP itself does not use the vestigial u''... it uses i''. Where did u'' come from?
The only difference in the PEP is the fact that the iu"" variant calls a different builtin (__interpolateu__ instead of __interpolate__). There's no change to the semantics of u"" - those remain identical to "". Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
As written in the PEP - where i'' means 'I have the __interpolate__ method' and iu'' means 'I have the __interpolateu__ method' (or that translators should call these methods) - it is fine, as the meaning of u ('I am unicode, yeah, you already knew that') isn't changed. On 8/8/2015 11:07, Nick Coghlan wrote:
[...]
Can the discussion of PEP 501 be done in a separate thread? As of right now this thread has not been about PEP 498 beyond Eric's initial email. On Sat, Aug 8, 2015 at 8:56 AM Alexander Walters <tritium-list@sdamon.com> wrote:
[...]
On 8 August 2015 at 19:34, Nick Coghlan <ncoghlan@gmail.com> wrote:
[...]
I've also now written and posted an initial draft of PEP 500,
I've actually moved this to PEP 501, for reasons of liking a proposed alternate use of PEP 500 :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Aug 08 2015, Nick Coghlan <ncoghlan@gmail.com> wrote:
[...]
I've also now written and posted an initial draft of PEP 500, [...]
I think what that PEP really needs is a concise summary of the *differences* to PEP 498. Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.«
On Aug 08 2015, Nikolaus Rath <Nikolaus@rath.org> wrote:
[...]
I think what that PEP really needs is a concise summary of the *differences* to PEP 498.
I should probably elaborate on that. After reading both PEPs, it seems to me that the only difference is that you want to use a different prefix (i instead of f), use ${} instead of {}, and call a builtin function to perform the interpolation (instead of always using format). But is that really it? The PEP appears rather long, so I'm not sure if I'm missing other differences in the parts that seemed identical to PEP 498 to me. Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.«
On Fri, Aug 7, 2015 at 6:39 PM Eric V. Smith <eric@trueblade.com> wrote:
[...]
I fixed a grammar nit directly in the PEP, but otherwise I'm +1 on the proposal.
On 8 August 2015 at 11:39, Eric V. Smith <eric@trueblade.com> wrote:
[...]
I'd like to see an alternatives section, in particular listing alternative prefixes and why they weren't chosen over f. Off the top of my head, ones I've seen listed are:

    !
    $

Tim Delaney
On 8/8/2015 9:08 PM, Tim Delaney wrote:
[...]
I'd like to see an alternatives section, in particular listing alternative prefixes and why they weren't chosen over f. Off the top of my head, ones I've seen listed are:
! $
I'll add something, but there's no particular reason: "f" for formatted, along the lines of 'r' for raw, 'b' for bytes, and 'u' for unicode. Especially when you want to combine them, I think a letter looks better:

    fr'{x} a formatted raw string'
    $r'{x} a formatted raw string'

Eric.
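P.S. For a concrete (if hypothetical) case where composing the prefixes matters, a raw f-string would keep backslashes literal while still substituting the braced expression; `dirname` here is a made-up variable:

    fr'C:\temp\{dirname}'    # raw + formatted: backslashes stay literal
    f'C:\\temp\\{dirname}'   # the same result without the raw prefix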
After reading Nick's proposal and pondering the 'f' vs. 'r' examples, I like the 'i' prefix more (regardless of the internal implementation). The best solution would be a "no prefix, '{var}' only" syntax. I'm not sure whether that is possible at all; I cannot remember using '{...}' anywhere other than for formatting. On 09.08.2015 19:22, Eric V. Smith wrote:
[...]
On Aug 7, 2015, at 6:39 PM, Eric V. Smith <eric@trueblade.com> wrote:
I'm open to any suggestions to improve the PEP. Thanks for your feedback.
Here are a few thoughts:

* I really like the reduction in verbosity for passing in the variable names.

* Because of my C background, I experience a little mental hiccup when using the f-prefix with the print() function:

    print(f"The answer is {answer}")

wants to come out of my fingers as:

    printf("The answer is {answer}")

* It's unclear whether the string-to-expression expansion should be arbitrarily limited to locals() and globals() or whether it should include __builtins__ and cell variables (closures and nested scopes). Making it behave just like normal expressions means that there won't be new special cases to remember and that many existing calls to format() can be converted automatically:

    w = 10
    def f(x):
        def g(y):
            print(f'{len.__name__}{w}{x}{y}')

* Will this proposal complicate linters, analysis tools, highlighters, etc.? In a way, this isn't a small language extension, it is a whole new way to write expressions.

* Does it complicate situations where we would otherwise pass around templates as first-class objects (internationalization, for example)?

    def welcome(name, title):
        print(_("Good morning {title} {name}"))  # expect gettext() substitution

* A related thought is that we normally like templates to live outside the functions where they are used (separation of business logic and presentation logic). Use of f-strings may impact our ability to refactor (move code up or down a chain of nested function calls), our ability to pass in templates as arguments, storing templates in globals or thread locals so that they are shareable, or moving them out of our scripts and into files editable by non-programmers.

* With respect to learnability, the downside is that it becomes yet another thing to have to cover in a Python class (I'm already not looking forward to teaching star-unpacking generalizations and the restraint not to overuse them, and covering await, and single dispatch, etc., etc.). The upside is that templates themselves aren't being changed. The only incremental learning task is that the invocation becomes automatic, saving us a little typing.

The above are random thoughts based on a first quick read. Don't take them too seriously. Some are just shooting from the hip and are listed as food for thought.

Raymond
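P.S. To make the scoping question concrete, here is the behavior I would expect if the expansion follows normal expression semantics (a sketch of the proposed behavior, not something that runs today):

    w = 10                      # module-level global
    def f(x):
        def g(y):
            # Under normal expression semantics, each name is resolved
            # by the usual rules: len from builtins, w from globals,
            # x from the enclosing scope's cell, y from locals.
            return f'{len.__name__} {w} {x} {y}'
        return g

    print(f(2)(3))              # would print: len 10 2 3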
Eric V. Smith wrote on 08.08.2015 at 03:39:
[...]
[copying my comment from python-ideas here]

How common is this use case, really? Almost all of the string formatting that I've used lately is either for logging (no help from this proposal here) or requires some kind of translation/i18n *before* the formatting, which is not helped by this proposal either. Meaning, in almost all cases, the formatting will use some more or less simple variant of this pattern:

    result = process("string with {a} and {b}").format(a=1, b=2)

which commonly collapses into

    result = translate("string with {a} and {b}", a=1, b=2)

by wrapping the concrete use cases in appropriate helper functions.

I've seen Nick Coghlan's proposal for an implementation backed by a global function, which would at least catch some of these use cases. But it otherwise seems to me that this is a huge sledgehammer solution for a niche problem.

Stefan
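P.S. For the record, the kind of helper I mean is trivial to write today; a minimal gettext-based sketch (the name `translate` is mine, not a real API):

    import gettext
    _ = gettext.gettext

    def translate(msg, **kwargs):
        # Translate the template first, then interpolate - the order
        # i18n requires, which an eagerly evaluated literal can't give.
        return _(msg).format(**kwargs)

    result = translate("string with {a} and {b}", a=1, b=2)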
Stefan Behnel wrote on 09.08.2015 at 10:06:
[...]
How common is this use case, really? Almost all of the string formatting that I've used lately is either for logging (no help from this proposal here) or requires some kind of translation/i18n *before* the formatting, which is not helped by this proposal either.
Thinking about this some more, the "almost all" is actually wrong. This only applies to one kind of application that I'm working on. In fact, "almost all" of the string formatting that I use is not in those applications but in Cython's code generator. And there's a *lot* of string formatting in there, even though we use real templating for bigger things already.

However, looking through the code, I cannot see this proposal being of much help for that use case either. Many of the values that get formatted into the strings use some kind of non-trivial expression (function calls, object attributes, also local variables, sometimes variables with lengthy names) that is best written out in actual code. Here are some real example snippets:

    code.putln(
        'static char %s[] = "%s";' % (
            entry.doc_cname,
            split_string_literal(escape_byte_string(docstr))))

    if entry.is_special:
        code.putln('#if CYTHON_COMPILING_IN_CPYTHON')
        code.putln(
            "struct wrapperbase %s;" % entry.wrapperbase_cname)
        code.putln('#endif')

    temp = ...
    code.putln("for (%s=0; %s < PyTuple_GET_SIZE(%s); %s++) {" % (
        temp, temp, Naming.args_cname, temp))
    code.putln("PyObject* item = PyTuple_GET_ITEM(%s, %s);" % (
        Naming.args_cname, temp))

    code.put("%s = (%s) ? PyDict_Copy(%s) : PyDict_New(); " % (
        self.starstar_arg.entry.cname,
        Naming.kwds_cname,
        Naming.kwds_cname))
    code.putln("if (unlikely(!%s)) return %s;" % (
        self.starstar_arg.entry.cname,
        self.error_value()))

We use %-formatting for historical reasons (that's all there was 15 years ago), but I wouldn't switch to .format() because there is nothing to win here. The "%s" etc. placeholders are *very* short and do not get in the way (as "{}" would in C code templates). Named formatting would require a lot more space in the templates, so positional, unnamed formatting helps readability a lot. And the values used for the interpolation tend to be expressions rather than simple variables, so keeping those outside of the format strings simplifies both editing and reading.

That's the third major real-world use case for string formatting now where this proposal doesn't help. The niche is getting smaller.

Stefan
I don't know about you, but I sure like this better than what you have:

    code.putlines(f"""
    static char {entry.doc_cname}[] = "{
        split_string_literal(escape_bytestring(docstr))}";

    { # nested!
    f"""
    #if CYTHON_COMPILING_IN_CPYTHON
    struct wrapperbase {entry.wrapperbase_cname};
    #endif
    """ if entry.is_special else ''}

    {(lambda temp, argn: # my kingdom for a let!
    f"""
    for ({temp}=0; {temp}<PyTuple_GET_SIZE({argn}); {temp}++) {{
        PyObject *item = PyTuple_GET_ITEM({argn}, {temp});
    }}""")(..., Naming.args_cname)}

    {self.starstar_arg.entry.cname} = ({Naming.kwds_cname}) ? PyDict_Copy({Naming.kwds_cname}) : PyDict_New();
    if (unlikely(!{self.starstar_arg.entry.cname})) return {self.error_value()};
    """)

What do others think of this PEP-498 sample? (The PEP-501 version looks pretty similar, so I omit it.)

ijs

P.S.: Would it make sense to just treat the contents of an interpolation cell as being in parentheses? This would allow leading whitespace without special cases.

Top-posted from Microsoft Outlook Web App; may its designers be consigned for eternity to that circle of hell in which their dog food is consumed.
ISAAC J SCHWABACHER wrote on 11.08.2015 at 01:05:
I don't know about you, but I sure like this better than what you have:
code.putlines(f""" static char {entry.doc_cname}[] = "{ split_string_literal(escape_bytestring(docstr))}";
{ # nested! f""" #if CYTHON_COMPILING_IN_CPYTHON struct wrapperbase {entry.wrapperbase_cname}; #endif """ if entry.is_special else ''}
{(lambda temp, argn: # my kingdom for a let! f""" for ({temp}=0; {temp}<PyTuple_GET_SIZE({argn}); {temp}++) {{ PyObject *item = PyTuple_GET_ITEM({argn}, {temp}); }}""")(..., Naming.args_cname)}
{self.starstar_arg.entry.cname} = ({Naming.kwds_cname}) ? PyDict_Copy({Naming.kwds_cname}) : PyDict_New();
if (unlikely(!{self.starstar_arg.entry.cname})) return {self.error_value()}; """)
Matter of taste, I guess. Looks awful to me. It's very difficult to visually separate input and output in this code, so it requires a thorough look to see what data is being used for the formatting. Syntax highlighting and in-string expression completion should eventually help, once IDEs support it. But then editing this code will require an editor that has such support. And not everyone is going to be willing to get one. Stefan
Stefan Behnel wrote:
Syntax highlighting and in-string expression completion should eventually help, once IDEs support it.
Concerning that, this is going to place quite a burden on syntax highlighters. Doing it properly will require the ability to parse arbitrary Python expressions, or at least match nested brackets. An editor whose syntax highlighting engine is based on regular expressions could have trouble with that. -- Greg
Ruby already has this feature, and in my experience syntax highlighters handle it just fine. Here's what vim's default highlighter shows me:

    puts "we can #{
        ["include", "interpolate"].each { |s| puts s }
            .select { |s| s.include? "erp" }
            # .first
    } arbitrary expressions!"

So an editor whose syntax highlighting is based on regular expressions already can't cope with the world as it is. :) Does anyone reading this know of a tool that successfully highlights Python but not Ruby?

ijs
On Aug 10, 2015, at 11:05 PM, ISAAC J SCHWABACHER wrote:
code.putlines(f""" static char {entry.doc_cname}[] = "{ split_string_literal(escape_bytestring(docstr))}";
{ # nested! f""" #if CYTHON_COMPILING_IN_CPYTHON struct wrapperbase {entry.wrapperbase_cname}; #endif """ if entry.is_special else ''}
{(lambda temp, argn: # my kingdom for a let! f""" for ({temp}=0; {temp}<PyTuple_GET_SIZE({argn}); {temp}++) {{ PyObject *item = PyTuple_GET_ITEM({argn}, {temp}); }}""")(..., Naming.args_cname)}
{self.starstar_arg.entry.cname} = ({Naming.kwds_cname}) ? PyDict_Copy({Naming.kwds_cname}) : PyDict_New();
if (unlikely(!{self.starstar_arg.entry.cname})) return {self.error_value()}; """)
What do others think of this PEP-498 sample?
No offense intended, but I put this in an Emacs Python buffer and it made me want to cry. Cheers, -Barry
Now with syntax highlighting, if my email client cooperates: [...] Better?

ijs
On 08/10/2015 04:05 PM, ISAAC J SCHWABACHER wrote:
I don't know about you, but I sure like this better than what you have:
code.putlines(f""" static char {entry.doc_cname}[] = "{ split_string_literal(escape_bytestring(docstr))}";
{ # nested! f""" #if CYTHON_COMPILING_IN_CPYTHON struct wrapperbase {entry.wrapperbase_cname}; #endif """ if entry.is_special else ''}
{(lambda temp, argn: # my kingdom for a let! f""" for ({temp}=0; {temp}<PyTuple_GET_SIZE({argn}); {temp}++) {{ PyObject *item = PyTuple_GET_ITEM({argn}, {temp}); }}""")(..., Naming.args_cname)}
{self.starstar_arg.entry.cname} = ({Naming.kwds_cname}) ? PyDict_Copy({Naming.kwds_cname}) : PyDict_New();
if (unlikely(!{self.starstar_arg.entry.cname})) return {self.error_value()}; """)
What do others think of this PEP-498 sample? (The PEP-501 version looks pretty similar, so I omit it.)
Agh! My brain is hurting! ;) No, I don't care for it at all. -- ~Ethan~
Well, I seem to have succeeded in crystallizing opinions on the topic, even if the consensus is, "Augh! Make it stop!" :)

The primary objective of that code sample was to make the structure of the code as close as possible to the structure of the interpolated string, since having descriptive text like "{entry.doc_cname}" inline instead of "%s" is precisely what str.format gains over str.__mod__. But there are several different elements in that code, and I'm curious what people find most off-putting. Is it the triple-quoted format strings? The nesting? The interpolation with `"""...""" if cond else ''`? Just plain interpolations, as are already available with str.format, but without explicitly importing names into the format string's scope via **kwargs? Trying to emulate let? Would a different indentation scheme make things better, or is this a problem with the coding style I've advanced here, or with the feature itself?

Also, should this be allowed:

    def make_frob(foo):
        def frob(bar):
            f"""Frob the bar using {foo}"""

?

ijs

P.S.: I've translated the original snippet into Ruby here: https://gist.github.com/ischwabacher/405afb86e28282946cc5, since it's already legal syntax there. Ironically, GitHub's syntax highlighting either fails to parse the interpolation (in edit mode) or fails to treat the heredoc as a string literal (in display mode), but you can open it in your favorite editor to see whether the highlighting makes the code clearer.
On Sun, 9 Aug 2015 at 01:07 Stefan Behnel <stefan_ml@behnel.de> wrote:
[...]
So in my case the vast majority of calls to str.format could be replaced with an f-string. I would also like to believe that other languages that have adopted this approach to string interpolation did so with knowledge that it would be worth it (but then again I don't really know how other languages are developed so this might just be a hope that other languages fret as much as we do about stuff).
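To make "could be replaced" concrete, a typical conversion would be mechanical (a sketch using the proposed syntax, with made-up variables):

    # today, with str.format
    msg = 'Upgrading {} to {}'.format(name, version)
    # the same call site under PEP 498
    msg = f'Upgrading {name} to {version}'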
On 8/9/2015 1:38 PM, Brett Cannon wrote:
[...]
So in my case the vast majority of calls to str.format could be replaced with an f-string. I would also like to believe that other languages that have adopted this approach to string interpolation did so with knowledge that it would be worth it (but then again I don't really know how other languages are developed so this might just be a hope that other languages fret as much as we do about stuff).
I think it has to do with the nature of the programs that people write. I write software for internal use in a large company. In the last 13 years there, I've written literally hundreds of individual programs, large and small. I just checked: literally 100% of my calls to %-formatting (older code) or str.format (in newer code) could be replaced with f-strings. And I think every such use would be an improvement.

I firmly believe that the majority of software written in Python does not show up on PyPI, but is used internally in corporations. It's not internationalized or localized: it just exists to get a job done quickly. This is the code that would benefit from f-strings.

This isn't to say that there's not plenty of code where f-strings would not help. But I think it's as big a mistake to generalize from my experience as it is from Stefan's.

Eric.
Most of my outputs are log messages, so this proposal won't help me because (I presume) it does eager evaluation of the format string, while the logging methods are designed to do lazy evaluation. Python doesn't have anything like Lisp's "special forms", so there doesn't seem to be a way to implicitly put a lambda on the string to delay evaluation. It would be nice to be able to mark the formatting as lazy ... maybe another string prefix character to indicate that? (And would the 2nd expression in an assert statement be lazy or eager?)

PS: As to Brett's comment about the history of string interpolation ... my recollection/understanding is that it started with Unix shells and the "$variable" notation, with "$variable" being evaluated within "..." but not within '...'. Perl, PHP, Make (and others) picked this up. There seems to be a trend to avoid the bare "$variable" form and instead use "${variable}" everywhere, mainly because "${...}" is sometimes required to avoid ambiguities (e.g. "There were $NUMBER ${THING}s.").

PPS: For anyone wishing to improve the existing format options, Common Lisp's FORMAT <http://www.gigamonkeys.com/book/a-few-format-recipes.html> and Prolog's format/2 <https://quintus.sics.se/isl/quintus/html/quintus/mpg-ref-format.html> have some capabilities that I miss from time to time in Python.
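PPPS: To spell out the eager/lazy contrast I mean (a sketch; `foo` and `bar` are placeholders):

    import logging
    log = logging.getLogger(__name__)
    foo, bar = 'it', 'strange'   # placeholder values for illustration

    # Lazy: logging defers the %-interpolation until it knows the
    # record will actually be emitted.
    log.debug('%s just did a %s thing', foo, bar)

    # Eager: an f-string would be fully evaluated before debug() is
    # even called, whether or not the message is discarded.
    log.debug(f'{foo} just did a {bar} thing')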
On Sun, Aug 9, 2015, 13:51 Peter Ludemann via Python-Dev <python-dev@python.org> wrote:

Most of my outputs are log messages, so this proposal won't help me because (I presume) it does eager evaluation of the format string and the logging methods are designed to do lazy evaluation. Python doesn't have anything like Lisp's "special forms", so there doesn't seem to be a way to implicitly put a lambda on the string to delay evaluation. It would be nice to be able to mark the formatting as lazy ... maybe another string prefix character to indicate that? (And would the 2nd expression in an assert statement be lazy or eager?)

That would require a lazy string type, which is beyond the scope of this PEP as proposed, since it would require its own design choices (how much code would not like the different type, etc.).

-Brett
On Sun, 9 Aug 2015 at 01:07 Stefan Behnel <stefan_ml@behnel.de
<mailto:stefan_ml@behnel.de>> wrote:
Eric V. Smith schrieb am 08.08.2015 um 03:39: > Following a long discussion on python-ideas, I've posted my draft of > PEP-498. It describes the "f-string" approach that was the subject of > the "Briefer string format" thread. I'm open to a better title than > "Literal String Formatting". > > I need to add some text to the discussion section, but I think it's in > reasonable shape. I have a fully working implementation that I'll get > around to posting somewhere this weekend. > > >>> def how_awesome(): return 'very' > ... > >>> f'f-strings are {how_awesome()} awesome!' > 'f-strings are very awesome!' > > I'm open to any suggestions to improve the PEP. Thanks for your feedback.
[copying my comment from python-ideas here]
How common is this use case, really? Almost all of the string formatting that I've used lately is either for logging (no help from this proposal here) or requires some kind of translation/i18n *before* the formatting, which is not helped by this proposal either. Meaning, in almost all cases, the formatting will use some more or less simple variant of this pattern:

result = process("string with {a} and {b}").format(a=1, b=2)

which commonly collapses into

result = translate("string with {a} and {b}", a=1, b=2)

by wrapping the concrete use cases in appropriate helper functions.

I've seen Nick Coghlan's proposal for an implementation backed by a global function, which would at least catch some of these use cases. But otherwise it seems to me that this is a huge sledgehammer of a solution for a niche problem.
So in my case the vast majority of calls to str.format could be replaced with an f-string. I would also like to believe that other languages that have adopted this approach to string interpolation did so with knowledge that it would be worth it (but then again I don't really know how other languages are developed so this might just be a hope that other languages fret as much as we do about stuff).
![](https://secure.gravatar.com/avatar/a156565188a76cac76f629d85e6ac4c5.jpg?s=120&d=mm&r=g)
What if logging understood lambda? (By testing for types.FunctionType.) This is outside PEP 498, but there might be some recommendations on how "lazy" evaluation should be done and understood by some functions. E.g.:

log.info(lambda: f'{foo} just did a {bar} thing')

It's not pretty, but it's not too verbose. As far as I can tell, PEP 498 would work with this because it implicitly supports closures; that is, it's defined as equivalent to

log.info(lambda: ''.join([format(foo), ' just did a ', format(bar), ' thing']))

On 9 August 2015 at 15:25, Brett Cannon <brett@python.org> wrote:
That would require a lazy string type which is beyond the scope of this PEP as proposed since it would require its own design choices, how much code would not like the different type, etc.
-Brett
![](https://secure.gravatar.com/avatar/2828041405aa313004b6549acf918228.jpg?s=120&d=mm&r=g)
On 8/9/2015 8:24 PM, Peter Ludemann wrote:
What if logging understood lambda? (By testing for types.FunctionType). This is outside PEP 498, but there might be some recommendations on how "lazy" evaluation should be done and understood by some functions.
e.g.: log.info(lambda: f'{foo} just did a {bar} thing')
It's not pretty, but it's not too verbose. As far as I can tell, PEP 498 would work with this because it implicitly supports closures; that is, it's defined as equivalent to log.info(lambda: ''.join([format(foo), ' just did a ', format(bar), ' thing']))
That basically works:

class Foo:
    def __init__(self, name):
        self.name = name

    def __format__(self, fmt):
        print(f'__format__: {self.name}')
        return f'{self.name}'

class Logger:
    # accumulate log messages until flush is called
    def __init__(self):
        self.values = []

    def log(self, value):
        self.values.append(value)

    def flush(self):
        for value in self.values:
            if callable(value):
                value = value()
            print(f'log: {value}')

logger = Logger()

f1 = Foo('one')
f2 = Foo('two')
print('before log calls')
logger.log('first log message')
logger.log(lambda: f'f: {f1} {f2}')
logger.log('last log message')
print('after log calls')
f1 = Foo('three')
logger.flush()

produces:

before log calls
after log calls
log: first log message
__format__: three
__format__: two
log: f: three two
log: last log message

But note that when the lambdas are called, f1 is bound to Foo('three'), so that's what's printed. I don't think that's what the logging module would normally do, since it wouldn't see the rebinding.

I guess you'd have to change logging to do something special if it had a single argument which is a callable, or add a new interface to it. And of course you'd have to live with the ugliness of lambdas in the logging calls. So, I can't say I'm a huge fan of the approach. But writing examples using f-strings is way more fun than using %-formatting or str.format!

But it does remind me I still need to implement f'{field:{width}}'.

Eric.
![](https://secure.gravatar.com/avatar/2828041405aa313004b6549acf918228.jpg?s=120&d=mm&r=g)
On 8/9/2015 9:02 PM, Eric V. Smith wrote:
Here's a better example that shows the closure. Same output as above:

class Foo:
    def __init__(self, name):
        self.name = name

    def __format__(self, fmt):
        print(f'__format__: {self.name}')
        return f'{self.name}'

class Logger:
    # accumulate log messages until flush is called
    def __init__(self):
        self.values = []

    def log(self, value):
        self.values.append(value)

    def flush(self):
        for value in self.values:
            if callable(value):
                value = value()
            print(f'log: {value}')

def do_something(logger):
    f1 = Foo('one')
    f2 = Foo('two')
    print('before log calls')
    logger.log('first log message')
    logger.log(lambda: f'f: {f1} {f2}')
    logger.log('last log message')
    print('after log calls')
    f1 = Foo('three')

logger = Logger()
do_something(logger)
logger.flush()
![](https://secure.gravatar.com/avatar/7f37d34f3bb0e91890c01450f8321524.jpg?s=120&d=mm&r=g)
On Sun, Aug 9, 2015 at 3:25 PM Brett Cannon <brett@python.org> wrote:
On Sun, Aug 9, 2015, 13:51 Peter Ludemann via Python-Dev < python-dev@python.org> wrote:
Most of my outputs are log messages, so this proposal won't help me because (I presume) it does eager evaluation of the format string and the logging methods are designed to do lazy evaluation. Python doesn't have anything like Lisp's "special forms", so there doesn't seem to be a way to implicitly put a lambda on the string to delay evaluation.
It would be nice to be able to mark the formatting as lazy ... maybe another string prefix character to indicate that? (And would the 2nd expression in an assert statement be lazy or eager?)
That would require a lazy string type which is beyond the scope of this PEP as proposed since it would require its own design choices, how much code would not like the different type, etc.
-Brett
Agreed, that doesn't belong in PEP 498 or 501 itself... But it is a real need. We left logging behind when we added str.format(), and adding yet another _third_ way to do string formatting without addressing the needs of deferred formatting for things like logging is annoying.

brainstorm: Imagine a deferred interpolation string with a d'' prefix. di'foo ${bar}' would be a new type with a __str__ method that also retains a runtime reference to the necessary values from the scope within which it was created, to be used for substitutions if/when it is __str__()ed. I still wouldn't enjoy reminding people to use di'' in logging.info(di'thing happened: ${result}') all the time, any more than I like reminding people to undo their use of % and just pass the values as additional args to the logging call... But I think people would find it friendlier and thus be more likely to get it right on their own. logging's manually deferred % is an idiom I'd like to see wither away.

There's also a performance aspect to any new formatter: % is oddly pretty fast, str.format isn't. So long as you can do stuff at compile time rather than runtime, I think these PEPs could be even faster. Constant-string PEP 498 or PEP 501 formatting could be broken down at compile time and composed into the optimal set of operations to build the resulting string / call the formatter.

So far, looking over both PEPs, I lean towards PEP 501 rather than 498: I really prefer the ${} syntax. I don't like arbitrary logical expressions within strings. I dislike str-only things without a similar concept for bytes. But neither quite suits me yet.

501's __interpolate*__ builtins are good and bad at the same time. Doing this at the module level does seem right, and I like the i18n use aspect of that, but you could also imagine these being methods so that subclasses could override the behavior on a per-type basis. But that probably only makes sense if a deferred type is created, due to when and how the interpolates would be called. Also, adding builtins, even __ones__, annoys me for some reason I can't quite put my finger on.

(jumping into the threads way late) -gps
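A minimal sketch of the deferred idea, assuming the compiler desugared di'thing happened: ${result}' into a constructor call; the class name and the desugaring are hypothetical, and stdlib string.Template stands in for whatever substitution the real type would do:

import string

class DeferredInterpolation:
    # Hold a template plus the values captured where it was written;
    # render only if/when str() is actually called.
    def __init__(self, template, values):
        self.template = template
        self.values = values

    def __str__(self):
        return string.Template(self.template).substitute(self.values)

result = 'success'
msg = DeferredInterpolation('thing happened: ${result}', {'result': result})
# a logging call could store msg and never render it if the level is off
print(str(msg))  # -> thing happened: success

A logger handed such an object would pay the substitution cost only for records it actually emits.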
![](https://secure.gravatar.com/avatar/a156565188a76cac76f629d85e6ac4c5.jpg?s=120&d=mm&r=g)
How is this proposal of di"..." more than a different spelling of lambda i"..."? (I think it's a great idea; I'm just wondering if there are some extra semantics that I missed.)

I don't think there's any need to preserve the values of the {...} (or ${...}) constituents; the normal closure mechanism should do fine, because logging is more-or-less like this:

if <conditions for logging>:
    if callable(msg):
        log_msg = msg(*args)
    else:
        log_msg = msg % args

and so there's no need to preserve the values at the moment the interpolated string is created.

Perl allows arbitrary expressions inside interpolations, but that tends to get messy and is self-limiting for complex expressions; however, it's handy for things like:

print("The {i+1}th item is strange: {x[i]}")

On 16 August 2015 at 13:04, Gregory P. Smith <greg@krypto.org> wrote:
![](https://secure.gravatar.com/avatar/92136170d43d61a5eeb6ea8784294aa2.jpg?s=120&d=mm&r=g)
On Sun, Aug 9, 2015 at 11:22 AM, Eric V. Smith <eric@trueblade.com> wrote:
I think it has to do with the nature of the programs that people write. I write software for internal use in a large company. In the last 13 years there, I've written literally hundreds of individual programs, large and small. I just checked: literally 100% of my calls to %-formatting (older code) or str.format (in newer code) could be replaced with f-strings. And I think every such use would be an improvement.
I'm sure that pretty darn close to 100% of the uses of %-formatting and str.format I've written in the last 13 years COULD be replaced by the proposed f-strings (I suppose about 16 years for me, actually). But I think that every single such replacement would make the programs worse. I'm not sure if it helps to mention that I *did* actually "write the book" on _Text Processing in Python_ :-).

The proposal just continues to seem far too magical to me. In the training I now do for Continuum Analytics (I'm in charge of the training program with one other person), I specifically have a (very) little bit of the lessons where I mention something like:

print("{foo} is {bar}".format(**locals()))

But I give that entirely as a negative example of abusing code and introducing fragility. f-strings are really the same thing, only even more error-prone and easier to get wrong. Relying on the implicit context of the runtime state of variables that merely happen to be in scope still feels very break-y to me. If I had to teach f-strings in the future, I'd teach it as a Python wart.

That said, there *is* one small corner where I believe f-strings add something helpful to the language. There is no really concise way to spell:

collections.ChainMap(locals(), globals(), __builtins__.__dict__)

If we could spell that as, say, `lgb()`, that would let str.format() or %-formatting pick up the full "what's in scope". To my mind, that's the only good thing about the f-string idea.

Yours, David...

-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
![](https://secure.gravatar.com/avatar/92136170d43d61a5eeb6ea8784294aa2.jpg?s=120&d=mm&r=g)
Y'know, I just read a few more posts over on python-ideas that I had missed somehow. I saw Guido's point about `**locals()` being too specialized and magical for beginners, which I agree with. And it's the other aspect of "magic" that makes me not like f-strings. The idea of *implicitly* getting values from the local scope (or really, the globals-locals-builtins scope) makes me worry that readers of the code will very easily miss what's really going on within an f-string.

I don't actually care about the code injection issues and that sort of thing. I mean, OK, I care a little bit, but my actual concern is purely explicitness and readability. Which brought to mind a certain thought. While I don't like:

f'My name is {name}, my age next year is {age+1}'

I wouldn't have any similar objection to:

'My name is {name}, my age next year is {age+1}'.scope_format()

Or:

scope_format('My name is {name}, my age next year is {age+1}')

I realize that these could be completely semantically equivalent... but the function or method call LOOKS LIKE a runtime operation, while a one-letter prefix just doesn't look like that (especially to beginners whom I might teach). The name 'scope_format' is ugly, and something shorter would be nicer, but I think this conveys my idea.

Yours, David...
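A minimal sketch of scope_format under a name-lookup-only reading, combining it with the lgb() chain from the previous message; sys._getframe and the helper name are assumptions for illustration, not anything either PEP specifies:

import builtins
import collections
import sys

def scope_format(template):
    # format_map() against the caller's locals, globals, and builtins
    frame = sys._getframe(1)  # the caller's frame, not ours
    scope = collections.ChainMap(frame.f_locals, frame.f_globals,
                                 vars(builtins))
    return template.format_map(scope)

name = 'Fred'
print(scope_format('My name is {name}'))  # -> My name is Fred
# {age+1} would raise KeyError here: str.format_map only does name,
# attribute, and index lookups. Supporting full expressions is exactly
# what would turn this into eval() in disguise, as the next message argues.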
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Sun, Aug 09, 2015 at 06:54:38PM -0700, David Mertz wrote:
Which brought to mind a certain thought. While I don't like:
f'My name is {name}, my age next year is {age+1}'
I wouldn't have any similar objection to:
'My name is {name}, my age next year is {age+1}'.scope_format()
Or
scope_format('My name is {name}, my age next year is {age+1}')
I realize that these could be completely semantically equivalent... but the function or method call LOOKS LIKE a runtime operation, while a one letter prefix just doesn't look like that (especially to beginners whom I might teach).
I fear that this is actually worse than the f-string concept. f-strings, as far as I understand, are literals. (Well, not exactly literals.) You cannot say:

# this can't happen (I think?)
expr = 'age + 1'
result = f'blah blah blah {' + expr + '}'

and inject the expression into the f-string. That makes them a little weaker than eval(), and hence a little safer. But scope_format would have to be eval in disguise, since it receives a string as argument, and it can't know where it came from or how it came to be:

# pretend that expr comes from, say, a web form
expr = 'age + 1}{os.system("echo Pwned!") and ""'
result = scope_format(
    'My name is {name}, my age next year is {' + expr + '}'
)

It's a dilemma, because I'm completely with you in your discomfort in having something which looks like a string literal actually be a function of sorts; but turning it into an actual function makes it more dangerous, not less.

I think I would be happy with f-strings, or perhaps i-strings if we use Nick's ideas about internationalisation, and limit what they can evaluate to name lookups, attribute lookups, and indexing, just like format(). We can always relax that restriction in the future, if necessary, but it's a lot harder to tighten it.

-- Steve
![](https://secure.gravatar.com/avatar/2828041405aa313004b6549acf918228.jpg?s=120&d=mm&r=g)
On 08/10/2015 01:26 PM, Steven D'Aprano wrote:
On Sun, Aug 09, 2015 at 06:54:38PM -0700, David Mertz wrote:
Which brought to mind a certain thought. While I don't like:
f'My name is {name}, my age next year is {age+1}'
I wouldn't have any similar objection to:
'My name is {name}, my age next year is {age+1}'.scope_format()
Or
scope_format('My name is {name}, my age next year is {age+1}')
I realize that these could be completely semantically equivalent... but the function or method call LOOKS LIKE a runtime operation, while a one letter prefix just doesn't look like that (especially to beginners whom I might teach).
I fear that this is actually worse than the f-string concept. f-strings, as far as I understand, are literals. (Well, not exactly literals.) You cannot say:
# this can't happen (I think?)
expr = 'age + 1'
result = f'blah blah blah {' + expr + '}'
and inject the expression into the f-string. That makes them a little weaker than eval(), and hence a little safer.
Correct. f-strings only work on literals. They essentially convert the f-string literal into an expression (which is not strictly specified in the PEP, but it has examples).
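Roughly the kind of expression an f-string becomes, going by the PEP's examples (illustrative only; the exact form the implementation emits isn't pinned down here):

name = 'Fred'
age = 49
# f'My name is {name}, my age next year is {age+1}' behaves like:
result = ''.join(['My name is ', format(name),
                  ', my age next year is ', format(age + 1)])
print(result)  # -> My name is Fred, my age next year is 50

The expressions are fixed at compile time; only their values vary at runtime, which is why the injection above can't happen.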
But scope_format would have to be eval in disguise, since it receives a string as argument, and it can't know where it came from or how it came to be:
# pretend that expr comes from, say, a web form
expr = 'age + 1}{os.system("echo Pwned!") and ""'
result = scope_format(
    'My name is {name}, my age next year is {' + expr + '}'
)
It's a dilemma, because I'm completely with you in your discomfort in having something which looks like a string literal actually be a function of sorts; but turning it into an actual function makes it more dangerous, not less.
I think I would be happy with f-strings, or perhaps i-strings if we use Nick's ideas about internationalisation, and limit what they can evaluate to name lookups, attribute lookups, and indexing, just like format().
We can always relax that restriction in the future, if necessary, but it's a lot harder to tighten it.
This desire, which many people have expressed, is not completely lost on me. Eric.
![](https://secure.gravatar.com/avatar/01aa7d6d4db83982a2f6dd363d0ee0f3.jpg?s=120&d=mm&r=g)
On Aug 11, 2015, at 03:26 AM, Steven D'Aprano wrote:
I think I would be happy with f-strings, or perhaps i-strings if we use Nick's ideas about internationalisation, and limit what they can evaluate to name lookups, attribute lookups, and indexing, just like format().
I still think you really only need name lookups, especially in an i18n context. Anything else is overkill, YAGNI, potentially error prone, or perhaps even harmful.

Remember that the translated strings usually come from only moderately (if at all) trusted and verified sources, so it's entirely possible that a malicious translator could sneak in an exploit, especially if you're evaluating arbitrary expressions. If you're only doing name substitutions, then the worst that can happen is an information leak, which is bad, but won't compromise the integrity of, say, a server using the translation.

Even if the source strings avoid the use of expressions, if the feature is available, a translator could still sneak something in. That pretty much makes it a non-starter for i18n, IMHO.

Besides, any expression you have to calculate can go in a local that will get interpolated. The same goes for any !r or other formatting modifiers. In an i18n context, you want to stick to the simplest possible substitution placeholders.

Cheers, -Barry
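The name-only style described here is essentially what the stdlib's PEP 292 string.Template already provides; a small sketch of why it limits the blast radius (the catalog strings are made up):

from string import Template

values = {'age': 42}

# a translator can reference provided names...
print(Template('Tengo $age años').substitute(values))  # -> Tengo 42 años

# ...but cannot call code; an unknown placeholder is just a KeyError,
# not an expression evaluation:
try:
    Template('$password').substitute(values)
except KeyError as exc:
    print('unprovided name:', exc)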
![](https://secure.gravatar.com/avatar/2828041405aa313004b6549acf918228.jpg?s=120&d=mm&r=g)
On 08/10/2015 02:31 PM, Barry Warsaw wrote:
Besides, any expression you have to calculate can go in a local that will get interpolated. The same goes for any !r or other formatting modifiers. In an i18n context, you want to stick to the simplest possible substitution placeholders.
This is why I think PEP-498 isn't the solution for i18n. I'd really like to be able to say, in a debugging context:

print('a:{self.a} b:{self.b} c:{self.c} d:{self.d}')

without having to create locals to hold these 4 values.

Eric.
![](https://secure.gravatar.com/avatar/61a537f7b31ecf682e3269ea04056e94.jpg?s=120&d=mm&r=g)
On 2015-08-10 2:37 PM, Eric V. Smith wrote:
Besides, any expression you have to calculate can go in a local that will get interpolated. The same goes for any !r or other formatting modifiers. In an i18n context, you want to stick to the simplest possible substitution placeholders.

This is why I think PEP-498 isn't the solution for i18n. I'd really like to be able to say, in a debugging context:

print('a:{self.a} b:{self.b} c:{self.c} d:{self.d}')

without having to create locals to hold these 4 values.
Why can't we restrict expressions in f-strings to attribute/item getters? I.e., allow f'{foo.bar.baz}' and f'{self.foo["bar"]}', but disallow f'{foo.bar(baz=something)}'.

Yury
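One way such a restriction could be checked, as a sketch: parse the expression and whitelist AST node types. The allowed set below is an assumption drawn from the examples above, and the node names are those of Python 3.9+:

import ast

ALLOWED = (ast.Expression, ast.Name, ast.Attribute, ast.Subscript,
           ast.Constant, ast.Load)

def is_restricted(expr):
    # True if expr uses only names, attribute access, and literal indexing
    try:
        tree = ast.parse(expr, mode='eval')
    except SyntaxError:
        return False
    return all(isinstance(node, ALLOWED) for node in ast.walk(tree))

print(is_restricted('foo.bar.baz'))             # True
print(is_restricted('self.foo["bar"]'))         # True
print(is_restricted('foo.bar(baz=something)'))  # False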
![](https://secure.gravatar.com/avatar/2828041405aa313004b6549acf918228.jpg?s=120&d=mm&r=g)
On 08/10/2015 02:44 PM, Yury Selivanov wrote:
Why can't we restrict expressions in f-strings to attribute/item getters?
I.e. allow f'{foo.bar.baz}' and f'{self.foo["bar"]}' but disallow f'{foo.bar(baz=something)}'
It's possible. But my point is that Barry doesn't even want attribute/item getters for an i18n solution, and I'm not willing to restrict it that much. Eric.
![](https://secure.gravatar.com/avatar/b1f36e554be0e1ae19f9a74d6ece9107.jpg?s=120&d=mm&r=g)
On 08/10/2015 02:49 PM, Eric V. Smith wrote:
It's possible. But my point is that Barry doesn't even want attribute/item getters for an i18n solution, and I'm not willing to restrict it that much.
I don't think attribute access and item access are on the same level here. In terms of readability of the resulting string literal, it would be reasonable to allow attribute access but disallow item access. And I think attribute access is reasonable to allow in the context of an i18n solution as well (but item access is not). Item access is much harder to read and easier for translators to mess up because of all the extra punctuation (and the not-obvious-to-a-non-programmer distinction between a literal or variable key).

There's also the solution used by the Django and Jinja templating languages, where dot-notation can mean either attribute access (preferentially) or item access with a literal key (as fallback). That manages to achieve both a high level of readability of the literal/template and a high level of flexibility for the context provider (who may find it easier to provide a dictionary than an object), but it may fail the "too different from Python" test.

Carl
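A sketch of that dot-notation lookup as described here (attribute access first, item access with the literal key as fallback; the helper name is made up):

def resolve(obj, name):
    # template-style dot lookup: attribute access, falling back to item access
    try:
        return getattr(obj, name)
    except AttributeError:
        return obj[name]

class Config:
    host = 'example.com'

print(resolve(Config(), 'host'))               # attribute -> 'example.com'
print(resolve({'host': 'localhost'}, 'host'))  # item fallback -> 'localhost'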
![](https://secure.gravatar.com/avatar/63ca18e130d527d0741f1da54bb129a7.jpg?s=120&d=mm&r=g)
On Mon, Aug 10, 2015 at 2:04 PM, Carl Meyer <carl@oddbird.net> wrote:
References for (these) PEPs:

One advantage of Python HAVING required explicit contexts for template-format interpolation is that, to do string formatting correctly (e.g. *for anything other than printing strings to console*, or for formats with defined field/record boundary delimiters, which even then may contain shell control escape codes), we've had to write and use external modules which are specific to the output domain (JSON, HTML, CSS, SQL, SPARQL, [...]). There are a number of posts about operator syntax; regardless, IMHO it's not convenient enough to justify losing this distinctive 'security' feature (explicit variable bindings for string interpolation) of Python as a scripting language, as compared to e.g. Perl or Ruby.

Jinja2 reimplements and extends Django template syntax: {% for %}{{ variable_or_expr | filtercallable }}{% endfor %}

* Jinja2 supports configurable operators: {{ can instead be !!, !{, ${, or ??
* Because it is a compilable function composition, Jinja2 supports extensions: https://github.com/mitsuhiko/jinja2/blob/master/tests/test_ext.py
* Jinja2 supports {% trans %}, _(''), and gettext("") babel-style i18n: http://jinja.pocoo.org/docs/dev/templates/#i18n
* Jinja2 supports autoescaping: http://jinja.pocoo.org/docs/dev/api/#autoescaping (e.g. the 'jinja2.ext.autoescape' AutoEscapeExtension [ScopedEvalContextModifier]): https://github.com/mitsuhiko/jinja2/blob/master/jinja2/ext.py#L434
* Preprocessors and similar things are then just jinja2.ext.Extension s.
* Jinja2 accepts an explicit context (where merge(globals, locals, kwargs) just feels wrong because it is; cf. lookup(**kwargs), lngb(**kwargs), salt pillar merges), compatible with collections.abc.MutableMapping: https://docs.python.org/3/library/collections.abc.html#collections.abc.MutableMapping
* Jinja2 marks strings with MarkupSafe (in order to prevent e.g. multiple escaping, or lack of escaping): https://pypi.python.org/pypi/MarkupSafe

f-strings would make it too easy for me to do the wrong thing, which other languages don't prevent (this does occur often [CWE Top 25 2011]), and I regard this as a current feature of Python.
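For reference, the autoescaping behavior pointed to above, as a small sketch (requires the third-party jinja2 package; illustrative only):

from jinja2 import Environment

env = Environment(autoescape=True)
template = env.from_string('Hello {{ name }}!')
# interpolated values are escaped automatically, neutralizing injected HTML:
print(template.render(name='<b>Mallory</b>'))
# -> Hello &lt;b&gt;Mallory&lt;/b&gt;!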
![](https://secure.gravatar.com/avatar/047f2332cde3730f1ed661eebb0c5686.jpg?s=120&d=mm&r=g)
On Mon, Aug 10, 2015 at 8:49 PM, Eric V. Smith <eric@trueblade.com> wrote:
It's possible. But my point is that Barry doesn't even want attribute/item getters for an i18n solution, and I'm not willing to restrict it that much.
I also don't want to tie this closely to i18n. That is (still) very much a world of its own.

What I want with f-strings (by any name) is a way to generalize from print() calls with multiple arguments. We can write

print('Todo:', len(self.todo), '; busy:', len(self.busy))

but the same thing is more awkward when you have to pass it as a single string to a function that just sends one string somewhere. And note that the above example inserts a space before the ';' which I don't really like. So it would be nice if instead we could write

print(f'Todo: {len(self.todo)}; busy: {len(self.busy)}')

which IMO is just as readable as the multi-arg print() call[1], and generalizes to other functions besides print().

In fact, the latter form has less punctuation noise than the former -- every time you insert an expression in a print() call, you have a quote+comma before it and a comma+quote after it, compared to a brace before and one after in the new form. (Note that this is an argument for using f'{...}' rather than '\{...}' -- for a single interpolation it's the same amount of typing, but for multiple interpolations, f'{...}{...}' is actually shorter than '\{...}\{...}', and also the \{ part is ugly.)

Anyway, this generalization from print() is why I want arbitrary expressions. Wouldn't it be silly if we introduced print() today and said "we don't really like to encourage printing complicated expressions, but maybe we can introduce them in a future version"... :-)

Continuing the print()-generalization theme, if things become too long to fit on a line we can write

print('Todo:', len(self.todo),
      '; busy:', len(self.busy))

Can we allow the same in f-strings? E.g.

print(f'Todo: {len(self.todo)
      }; busy: {len(self.busy)
      }')

or is that too ugly? It could also be solved using implicit concatenation, e.g.

print(f'Todo: {len(self.todo)}; '
      f'busy: {len(self.busy)}')

[1] Assuming syntax colorizers catch on.

-- --Guido van Rossum (python.org/~guido)
![](https://secure.gravatar.com/avatar/5ce43469c0402a7db8d0cf86fa49da5a.jpg?s=120&d=mm&r=g)
On 2015-08-10 20:23, Guido van Rossum wrote:
Can we allow the same in f-strings? E.g.

print(f'Todo: {len(self.todo)
      }; busy: {len(self.busy)
      }')

or is that too ugly? It could also be solved using implicit concatenation, e.g.

print(f'Todo: {len(self.todo)}; '
      f'busy: {len(self.busy)}')
I'd expect f'...' to follow similar rules to '...'. You could escape it:

print(f'Todo: {len(self.todo)\
      }; busy: {len(self.busy)\
      }')

which would be equivalent to:

print(f'Todo: {len(self.todo)      }; busy: {len(self.busy)      }')

or use a triple-quoted f-string:

print(f'''Todo: {len(self.todo)
      }; busy: {len(self.busy)
      }''')

which would be equivalent to:

print(f'Todo: {len(self.todo)\n      }; busy: {len(self.busy)\n      }')

(I think it might be OK to have a newline in the expression because it's wrapped in {...}.)
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Mon, Aug 10, 2015 at 09:23:15PM +0200, Guido van Rossum wrote: [...]
Anyway, this generalization from print() is why I want arbitrary expressions. Wouldn't it be silly if we introduced print() today and said "we don't really like to encourage printing complicated expressions, but maybe we can introduce them in a future version"... :-)
That's a straw-man argument. Nobody is arguing against allowing arbitrary expressions as arguments to functions. If you want a fair analogy, how about the reluctance to allow arbitrary expressions as decorators?

@[spam, eggs, cheese][switch]
def function(): ...

As far as I can see, the non-straw argument is that f-strings be limited to the same subset of expressions that format() accepts: name and attribute look-ups, and indexing.

-- Steve
![](https://secure.gravatar.com/avatar/01aa7d6d4db83982a2f6dd363d0ee0f3.jpg?s=120&d=mm&r=g)
On Aug 10, 2015, at 02:49 PM, Eric V. Smith wrote:
It's possible. But my point is that Barry doesn't even want attribute/item getters for an i18n solution, and I'm not willing to restrict it that much.
Actually, attribute chasing is generally fine, and flufl.i18n supports that. Translators can handle $foo.bar although you still do have to be careful about information leaks ("choose your foo's carefully"). Item getters have been more YAGNI than anything else. Cheers, -Barry
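To make the "careful about information leaks" point concrete, here is a minimal sketch (with invented class names, using plain str.format() rather than any proposed syntax) of how a template that can chase attributes can reach data it shouldn't:

class Config:
    SECRET_KEY = 'hunter2'

class Request:
    def __init__(self):
        self.config = Config()

# Imagine this template came back from a third-party translation catalog:
evil = 'Hello {req.config.SECRET_KEY}'
print(evil.format(req=Request()))  # prints: Hello hunter2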
![](https://secure.gravatar.com/avatar/97c543aca1ac7bbcfb5279d0300c8330.jpg?s=120&d=mm&r=g)
On Aug 10, 2015 11:33 AM, "Barry Warsaw" <barry@python.org> wrote:
On Aug 11, 2015, at 03:26 AM, Steven D'Aprano wrote:
I think I would be happy with f-strings, or perhaps i-strings if we use Nick's ideas about internationalisation, and limit what they can evaluate to name lookups, attribute lookups, and indexing, just like format().
I still think you really only need name lookups, especially for an i18n context. Anything else is just overkill, YAGNI, potentially error prone, or perhaps even harmful.
Remember that the translated strings usually come from only moderately (if at all) trusted and verified sources, so it's entirely possible that a malicious translator could sneak in an exploit, especially if you're evaluating arbitrary expressions. If you're only doing name substitutions, then the worst that can happen is an information leak, which is bad, but won't compromise the integrity of say a server using the translation.
Even if the source strings avoid the use of expressions, if the feature is available, a translator could still sneak something in. That pretty much makes it a non-starter for i18n, IMHO.
Besides, any expression you have to calculate can go in a local that will get interpolated. The same goes for any !r or other formatting modifiers. In an i18n context, you want to stick to the simplest possible substitution placeholders.
IIUC what Nick contemplates in PEP 501 is that when you write something like

i"I am ${self.age}"

then the python runtime would itself evaluate self.age and pass it on to the i18n machinery to do the actual substitution; the i18n machinery wouldn't even contain any calls to eval. The above string could be translated as

i"Tengo ${self.age} años"

but

i"Tengo ${self.password} años"

would be an error, because the runtime did not provide a value for self.password. So while arbitrarily complex expressions are allowed (at least as far as the language is concerned -- a given project or i18n toolkit could impose additional policy restrictions), by the time the interpolation machinery runs they'll effectively have been reduced to local variables with funny multi-token names.

This pretty much eliminates all the information leak and exploit concerns, AFAICT. From your comments about having to be careful about attribute chasing, it sounds like it might even be more robust than current flufl.i18n in this regard...? -n
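The no-eval mechanism described above can be sketched in a few lines. Everything here -- the catalog, the function name, the regex-based substitution -- is illustrative only, not PEP 501's actual API: the call site pre-evaluates each expression, and the i18n layer does nothing but key lookup, rejecting any placeholder it wasn't given a value for.

import re

CATALOG = {  # hypothetical translation catalog
    'I am ${self.age}': 'Tengo ${self.age} años',
}

def interpolate_translated(template, values):
    # 'values' maps literal expression text to a value the caller
    # already computed -- this layer never calls eval().
    translated = CATALOG.get(template, template)
    def repl(match):
        expr = match.group(1)
        if expr not in values:
            raise KeyError('translation uses %r but no value was provided' % expr)
        return str(values[expr])
    return re.sub(r'\$\{([^}]+)\}', repl, translated)

# The runtime would have evaluated self.age at the call site:
print(interpolate_translated('I am ${self.age}', {'self.age': 42}))
# -> 'Tengo 42 años'; a translation using ${self.password} would raise.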
![](https://secure.gravatar.com/avatar/63ca18e130d527d0741f1da54bb129a7.jpg?s=120&d=mm&r=g)
On Aug 10, 2015 4:52 PM, "Nathaniel Smith" <njs@pobox.com> wrote:
On Aug 10, 2015 11:33 AM, "Barry Warsaw" <barry@python.org> wrote:
On Aug 11, 2015, at 03:26 AM, Steven D'Aprano wrote:
I think I would be happy with f-strings, or perhaps i-strings if we use Nick's ideas about internationalisation, and limit what they can evaluate to name lookups, attribute lookups, and indexing, just like format().

I still think you really only need name lookups, especially for an i18n context. Anything else is just overkill, YAGNI, potentially error prone, or perhaps even harmful.

Remember that the translated strings usually come from only moderately (if at all) trusted and verified sources, so it's entirely possible that a malicious translator could sneak in an exploit, especially if you're evaluating arbitrary expressions. If you're only doing name substitutions, then the worst that can happen is an information leak, which is bad, but won't compromise the integrity of say a server using the translation.
Even if the source strings avoid the use of expressions, if the feature is available, a translator could still sneak something in. That pretty much makes it a non-starter for i18n, IMHO.
Besides, any expression you have to calculate can go in a local that will get interpolated. The same goes for any !r or other formatting modifiers. In an i18n context, you want to stick to the simplest possible substitution placeholders.
IIUC what Nick contemplates in PEP 501 is that when you write something like

i"I am ${self.age}"

then the python runtime would itself evaluate self.age and pass it on to the i18n machinery to do the actual substitution; the i18n machinery wouldn't even contain any calls to eval. The above string could be translated as

i"Tengo ${self.age} años"

but

i"Tengo ${self.password} años"

would be an error, because the runtime did not provide a value for self.password. So while arbitrarily complex expressions are allowed (at least as far as the language is concerned -- a given project or i18n toolkit could impose additional policy restrictions), by the time the interpolation machinery runs they'll effectively have been reduced to local variables with funny multi-token names.

This pretty much eliminates all the information leak and exploit concerns, AFAICT. From your comments about having to be careful about attribute chasing, it sounds like it might even be more robust than current flufl.i18n in this regard...?
No, those remain; but minimizing calls to eval is good, too.

I prefer explicit template context for good reason:
* scope / variable binding in list comprehensions,
* "it was called 'cmd' two nested scopes ago"

Again, convenient but dangerous (Django and Jinja can/do autoescaping), and making it far too easy to wrongly quote and not escape strings (which often contain domain-specific control characters).
-n
![](https://secure.gravatar.com/avatar/d91403ac7536133c4afa2525c9d9414c.jpg?s=120&d=mm&r=g)
This may seem like a simplistic solution to i18n, but why not just add a method to string objects (assuming we implement f-strings) that just returns the original, unprocessed string. If the string was not an f-string, it just returns self. The gettext module can be modified, I think trivially, to use the method instead of the string directly. Is this a horrible idea? - Alex W.
![](https://secure.gravatar.com/avatar/2828041405aa313004b6549acf918228.jpg?s=120&d=mm&r=g)
On 08/11/2015 11:09 AM, Alexander Walters wrote:
This may seem like a simplistic solution to i18n, but why not just add a method to string objects (assuming we implement f-strings) that just returns the original, unprocessed string. If the string was not an f-string, it just returns self. The gettext module can be modified, I think trivially, to use the method instead of the string directly.
You need the original string, in order to figure out what it translates to. You need the values to replace into that string, evaluated at runtime, in the context of where the string appears. And you need to know where in the original (or translated) string to put them. The problem is that there's no way to evaluate the values and, before they're substituted into the string, use a different template string with obvious substitution points. This is what PEP 501 is trying to do. Eric.
![](https://secure.gravatar.com/avatar/d91403ac7536133c4afa2525c9d9414c.jpg?s=120&d=mm&r=g)
On 8/11/2015 11:16, Eric V. Smith wrote:

On 08/11/2015 11:09 AM, Alexander Walters wrote:
This may seem like a simplistic solution to i18n, but why not just add a method to string objects (assuming we implement f-strings) that just returns the original, unprocessed string. If the string was not an f-string, it just returns self. The gettext module can be modified, I think trivially, to use the method instead of the string directly.

You need the original string, in order to figure out what it translates to. You need the values to replace into that string, evaluated at runtime, in the context of where the string appears. And you need to know where in the original (or translated) string to put them.

The problem is that there's no way to evaluate the values and, before they're substituted into the string, use a different template string with obvious substitution points. This is what PEP 501 is trying to do.

Eric.

I don't understand some of that. We already trust translators with _('foo {bar}').format(bar=bar) to not mess up the {bar} in the string, so that won't change. Is the issue handing the string back to Python to be formatted? Could gettext not make the same AST as an f-string would, and hand that back to Python? If you add a method to strings that returns the un-f-string-processed version of the string, doesn't that make all these problems solvable without PEP 501?
![](https://secure.gravatar.com/avatar/130fe9f08ce5d2b1716d32438a58c867.jpg?s=120&d=mm&r=g)
Couldn't you just store the original format string at some __format_str__ attribute of the formatted string? Just in case you need it.

x = f'{a}'

=>

x = '{}'.format(a)  # or whatever it turns out to be
x.__format_str__ = '{a}'

On 11.08.2015 17:16, Eric V. Smith wrote:
On 08/11/2015 11:09 AM, Alexander Walters wrote:
This may seem like a simplistic solution to i18n, but why not just add a method to string objects (assuming we implement f-strings) that just returns the original, unprocessed string. If the string was not an f-string, it just returns self. The gettext module can be modified, I think trivially, to use the method instead of the string directly. You need the original string, in order to figure out what it translates to. You need the values to replace into that string, evaluated at runtime, in the context of where the string appears. And you need to know where in the original (or translated) string to put them.
The problem is that there's no way to evaluate the values and, before they're substituted into the string, use a different template string with obvious substitution points. This is what PEP 501 is trying to do.
Eric.
![](https://secure.gravatar.com/avatar/2828041405aa313004b6549acf918228.jpg?s=120&d=mm&r=g)
On 08/11/2015 01:25 PM, Sven R. Kunze wrote:
Couldn't you just store the original format string at some __format_str__ attribute at the formatted string? Just in case you need it.
x = f'{a}'
=>
x = '{}'.format(a) # or whatever it turns out to be x.__format_str__ = '{a}'
Yes. But I think the i18n problem, as evidenced by the differences in PEPs 498 and 501, relates to the expression evaluation, not to keeping the original string. But if people think that this helps the i18n problem, I suggest proposing concrete changes to PEP 501. Eric.
On 11.08.2015 17:16, Eric V. Smith wrote:
On 08/11/2015 11:09 AM, Alexander Walters wrote:
This may seem like a simplistic solution to i18n, but why not just add a method to string objects (assuming we implement f-strings) that just returns the original, unprocessed string. If the string was not an f-string, it just returns self. The gettext module can be modified, I think trivially, to use the method instead of the string directly. You need the original string, in order to figure out what it translates to. You need the values to replace into that string, evaluated at runtime, in the context of where the string appears. And you need to know where in the original (or translated) string to put them.
The problem is that there's no way to evaluate the values and, before they're substituted into the string, use a different template string with obvious substitution points. This is what PEP 501 is trying to do.
Eric.
![](https://secure.gravatar.com/avatar/63ca18e130d527d0741f1da54bb129a7.jpg?s=120&d=mm&r=g)
On Aug 11, 2015 10:10 AM, "Alexander Walters" <tritium-list@sdamon.com> wrote:
This may seem like a simplistic solution to i18n, but why not just add a method to string objects (assuming we implement f-strings) that just returns the original, unprocessed string. If the string was not an f-string, it just returns self. The gettext module can be modified, I think trivially, to use the method instead of the string directly.
Is this a horrible idea?
This is a backward compatible macro to elide code in strings that should not be.

* IIUC, this would only be usable in 3.6+ (so, not at all, and style guide says NO)
* there should be a normal functional() way to accomplish this in a backwards compatible way
* formatlng() / lookup() would be more future compatible
- Alex W.
![](https://secure.gravatar.com/avatar/63ca18e130d527d0741f1da54bb129a7.jpg?s=120&d=mm&r=g)
On Aug 11, 2015 10:19 AM, "Wes Turner" <wes.turner@gmail.com> wrote:
On Aug 11, 2015 10:10 AM, "Alexander Walters" <tritium-list@sdamon.com> wrote:

This may seem like a simplistic solution to i18n, but why not just add a method to string objects (assuming we implement f-strings) that just returns the original, unprocessed string. If the string was not an f-string, it just returns self. The gettext module can be modified, I think trivially, to use the method instead of the string directly.
Is this a horrible idea?
- [ ] review all string interpolation (for "injection")
  * [ ] review every '%'
  * [ ] review every ".format()"
  * [ ] review every f-string (AND LOCALS AND GLOBALS)
  * every os.system, os.exec*, subprocess.Popen
  * every unclosed tag
  * every unescaped control character

This would create work we don't need.

Solution: __str_shell_ escapes, adds slashes, and quotes. __str__SQL__ refs a global list of reserved words.
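To ground the escaping concern in something runnable: a sketch (the hostile filename is made up) showing that shell interpolation is unsafe no matter which formatting syntax built the string, unless the value is quoted first:

import shlex
import subprocess

filename = 'untrusted; rm -rf ~'  # hostile input, purely illustrative

# Interpolating directly -- via %, .format(), or an f-string -- would
# hand the ';' to the shell. Quoting the value first defuses it:
subprocess.call('ls -l %s' % shlex.quote(filename), shell=True)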
This is a backward compatible macro to elide code in strings that should not be.

* IIUC, this would only be usable in 3.6+ (so, not at all, and style guide says NO)
* there should be a normal functional() way to accomplish this in a backwards compatible way
* formatlng() / lookup() would be more future compatible
- Alex W.
![](https://secure.gravatar.com/avatar/d91403ac7536133c4afa2525c9d9414c.jpg?s=120&d=mm&r=g)
On 8/11/2015 11:28, Wes Turner wrote:
On Aug 11, 2015 10:19 AM, "Wes Turner" <wes.turner@gmail.com> wrote:
- [ ] review all string interpolation (for "injection")
  * [ ] review every '%'
  * [ ] review every ".format()"
  * [ ] review every f-string (AND LOCALS AND GLOBALS)
  * every os.system, os.exec*, subprocess.Popen
  * every unclosed tag
  * every unescaped control character
This would create work we don't need.
Solution: __str_shell_ escapes, adds slashes, and quotes. __str__SQL__ refs a global list of reserved words.
I don't understand why % and .format got interjected into this. If you are mentioning them as 'get the unprocessed version of any string formatting', that is a bad idea, and not needed, since you already have an unprocessed string object. Assuming the method were named "hypothetical":

'foo bar'.hypothetical()  # returns 'foo bar'
'{0} bar'.format('foo').hypothetical()  # returns 'foo bar'
('%s bar' % ('foo',)).hypothetical()  # returns 'foo bar'
f'{foo} bar'.hypothetical()  # returns '{foo} bar', prime for translation.

Could gettext not be modified to create the same AST as f'{foo} bar' when it is translated to '{foo} le bar.' and inject it back into the runtime?
![](https://secure.gravatar.com/avatar/63ca18e130d527d0741f1da54bb129a7.jpg?s=120&d=mm&r=g)
On Tue, Aug 11, 2015 at 10:52 AM, Alexander Walters <tritium-list@sdamon.com> wrote:
On 8/11/2015 11:28, Wes Turner wrote:
On Aug 11, 2015 10:19 AM, "Wes Turner" <wes.turner@gmail.com> wrote:
- [ ] review all string interpolation (for "injection")
  * [ ] review every '%'
  * [ ] review every ".format()"
  * [ ] review every f-string (AND LOCALS AND GLOBALS)
  * every os.system, os.exec*, subprocess.Popen
  * every unclosed tag
  * every unescaped control character
This would create work we don't need.
Solution: __str_shell_ escapes, adds slashes, and quotes. __str__SQL__ refs a global list of reserved words.
I don't understand why % and .format got interjected into this.
If you are mentioning them as 'get the unprocessed version of any string formatting', that is a bad idea, and not needed, since you already have an unprocessed string object. Assuming the method were named "hypothetical":
'foo bar'.hypothetical()  # returns 'foo bar'
'{0} bar'.format('foo').hypothetical()  # returns 'foo bar'
('%s bar' % ('foo',)).hypothetical()  # returns 'foo bar'
f'{foo} bar'.hypothetical()  # returns '{foo} bar', prime for translation.
could gettext not be modified to create the same AST as f'{foo} bar' when it is translated to '{foo} le bar.' and inject it back into the runtime?
Well, we're talking about a functional [series of] transformations on __str__ (or __unicode__), with globals and locals, and more-or-less a macro for eliding this (**frequently wrong**, because when is a string not part of an output format with control characters that need to be escaped before they're interpolated?). % and str.format (and gettext) are the current ways to do this, and they are also **frequently wrong**, because HTML, SQL. The risk with this additional syntax is that unescaped globals and locals are transcluded (and/or translated), with an explicit (combination of) string prefixes to indicate forwards-compatible functional composition (of usually mutable types).
![](https://secure.gravatar.com/avatar/334b870d5b26878a79b2dc4cfcc500bc.jpg?s=120&d=mm&r=g)
Barry Warsaw writes:
Besides, any expression you have to calculate can go in a local that will get interpolated.
Sure, but that style should be an application programmer choice. If this syntax can't replace the vast majority of cases where the format method is invoked on a literal string without requiring introduction of gratuitous temporaries, I don't see the point. By "invoked", I mean the arguments to the format method, too, so even function calls should be permitted. To me it's not worth the expense of learning and teaching the differences otherwise. If that point of view were generally accepted, it seems to me that it kills this idea of using the same syntax for programmer interpolation and for translation interpolation. The two use cases present requirements that are too different since translators are generally "third party volunteers", *not* "trusted contributors". Nor are their contributions generally reviewed by "core".
In an i18n context, you want to stick to the simplest possible substitution placeholders.
Certainly, and in that case I think format strings with simple variable and attribute interpolation, plus an explicit invocation of the format method comprise TOOWDTI -- it's exactly what you want! In fact, I am now -1 on an implicitly formatted I18N quasi-literal. It seems to me that in fact we should think of such an internationalized string as merely an obfuscated way of spelling variable_input_by_user. The current I18N frameworks make this clear by requiring a function call, which theoretically could return any string and have any side effects -- but these are controlled by the programmer. But there are other contexts, equally important, where a more compact, implicit formatting syntax would be very valuable, starting with scripting. BTW, I know application programmers hate those calls. I wonder if they can't be folded into str.format, with a dummy string prefix of "i" (or "_"!) being allowed solely to trigger xgettext and similar potfile extraction utilities? So you'd write
s = i"Please translate this {adjective} string." s.format(adjective=i"beautiful", gettext=('ja', None)) "この美しい文字列を訳してくださいませ。"
where the first component of gettext is the language and the second is the gettext domain (defaulting to the current application). If that works, the transformation from monolingual application to internationalized application is sufficiently mechanical that a non-programmer could be easily taught to perform it.
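A rough sketch of how the fold-into-format idea could look as a plain function today; the function name, the keyword protocol, and the domain default are all invented for illustration, and this is not the syntax being proposed:

import gettext

def i18n_format(template, lang, domain='myapp', **kwargs):
    # Look up the translation for 'template', then apply ordinary
    # str.format() substitution with the caller-supplied values.
    t = gettext.translation(domain, languages=[lang], fallback=True)
    return t.gettext(template).format(**kwargs)

# With no catalog installed, fallback=True leaves the English in place:
print(i18n_format('Please translate this {adjective} string.', 'ja',
                  adjective='beautiful'))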
![](https://secure.gravatar.com/avatar/63ca18e130d527d0741f1da54bb129a7.jpg?s=120&d=mm&r=g)
On Aug 9, 2015 8:14 PM, "David Mertz" <mertz@gnosis.cx> wrote:
On Sun, Aug 9, 2015 at 11:22 AM, Eric V. Smith <eric@trueblade.com> wrote:
I think it has to do with the nature of the programs that people write. I write software for internal use in a large company. In the last 13 years there, I've written literally hundreds of individual programs, large and small. I just checked: literally 100% of my calls to %-formatting (older code) or str.format (in newer code) could be replaced with f-strings. And I think every such use would be an improvement.
I'm sure that pretty darn close to 100% of all the uses of %-formatting and str.format I've written in the last 13 years COULD be replaced by the proposed f-strings (I suppose about 16 years for me, actually). But I think that every single such replacement would make the programs worse. I'm not sure if it helps to mention that I *did* actually "write the book" on _Text Processing in Python_ :-).
The proposal just continues to seem far too magical to me. In the training I now do for Continuum Analytics (I'm in charge of the training program with one other person), I specifically have a (very) little bit of the lessons where I mention something like:
print("{foo} is {bar}".format(**locals()))
But I give that entirely as a negative example of abusing code and introducing fragility. f-strings are really the same thing, only even more error-prone and easier to get wrong. Relying on implicit context of the runtime state of variables that are merely in scope feels very break-y to me still. If I had to teach f-strings in the future, I'd teach it as a Python wart.

My editor matches \bsym\b, but not locals() or "{sym}"; when I press *. #traceability
That said, there *is* one small corner where I believe f-strings add something helpful to the language. There is no really concise way to spell:
collections.ChainMap(locals(), globals(), __builtins__.__dict__).
If we could spell that as, say `lgb()`, that would let str.format() or %-formatting pick up the full "what's in scope". To my mind, that's the only good thing about the f-string idea.

+1. This would be the explicit way to be loose with variable scope and string interpolation, while maintaining grep-ability.
Yours, David...
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
![](https://secure.gravatar.com/avatar/01aa7d6d4db83982a2f6dd363d0ee0f3.jpg?s=120&d=mm&r=g)
On Aug 09, 2015, at 06:14 PM, David Mertz wrote:
That said, there *is* one small corner where I believe f-strings add something helpful to the language. There is no really concise way to spell:
collections.ChainMap(locals(), globals(), __builtins__.__dict__).
If we could spell that as, say `lgb()`, that would let str.format() or %-formatting pick up the full "what's in scope". To my mind, that's the only good thing about the f-string idea.
That would certainly be useful to avoid sys._getframe() calls in my library, although I'd probably want the third argument to be optional (I wouldn't use it). If '{foo}' or '${foo}' syntax is adopted (with no allowance for '$foo'), it's very unlikely I'd use that over string.Template for internationalization, but the above would still be useful. Cheers, -Barry
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Sun, Aug 09, 2015 at 06:14:18PM -0700, David Mertz wrote: [...]
That said, there *is* one small corner where I believe f-strings add something helpful to the language. There is no really concise way to spell:
collections.ChainMap(locals(), globals(), __builtins__.__dict__).
I think that to match the normal name resolution rules, nonlocals() needs to slip in there between locals() and globals(). I realise that there actually isn't a nonlocals() function (perhaps there should be?).
If we could spell that as, say `lgb()`, that would let str.format() or %-formatting pick up the full "what's in scope". To my mind, that's the only good thing about the f-string idea.
I like the concept, but not the name. Initialisms tend to be hard to remember and rarely self-explanatory. How about scope()? -- Steve
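For the record, such a helper is expressible with today's interpreter; a minimal sketch (sys._getframe() is a CPython implementation detail, and as Eric notes below, frame locals don't cover every closure case):

import collections
import sys

def scope():
    # Merge the caller's locals, globals, and builtins into one mapping,
    # mimicking normal name resolution order. Free (nonlocal) variables
    # the caller actually references do show up in f_locals.
    frame = sys._getframe(1)
    return collections.ChainMap(frame.f_locals, frame.f_globals,
                                frame.f_builtins)

def demo():
    name, age = 'Guido', 59
    return '{name} is {age}'.format_map(scope())

print(demo())  # -> 'Guido is 59'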
![](https://secure.gravatar.com/avatar/2828041405aa313004b6549acf918228.jpg?s=120&d=mm&r=g)
On 08/10/2015 01:07 PM, Steven D'Aprano wrote:
On Sun, Aug 09, 2015 at 06:14:18PM -0700, David Mertz wrote:
[...]
That said, there *is* one small corner where I believe f-strings add something helpful to the language. There is no really concise way to spell:
collections.ChainMap(locals(), globals(), __builtins__.__dict__).
I think that to match the normal name resolution rules, nonlocals() needs to slip in there between locals() and globals(). I realise that there actually isn't a nonlocals() function (perhaps there should be?).
If we could spell that as, say `lgb()`, that would let str.format() or %-formatting pick up the full "what's in scope". To my mind, that's the only good thing about the f-string idea.
I like the concept, but not the name. Initialisms tend to be hard to remember and rarely self-explanatory. How about scope()?
I don't see how you're going to be able to do this in the general case. Not all variables end up in locals(). See PEP-498's discussion of closures, for example. Guido has already said locals() and globals() would not be part of the solution for string interpolation (also in the PEP). PEP-498 handles the non-general case: it parses through the string to find the variables used in the expressions, and then adds them to the symbol table. Eric.
![](https://secure.gravatar.com/avatar/92136170d43d61a5eeb6ea8784294aa2.jpg?s=120&d=mm&r=g)
I know. I elided including the nonexistent `nonlocals()` in there. But it *should* be `lngb()`. Or call it scope(). :-) On Aug 10, 2015 10:09 AM, "Steven D'Aprano" <steve@pearwood.info> wrote:
On Sun, Aug 09, 2015 at 06:14:18PM -0700, David Mertz wrote:
[...]
That said, there *is* one small corner where I believe f-strings add something helpful to the language. There is no really concise way to spell:
collections.ChainMap(locals(), globals(), __builtins__.__dict__).
I think that to match the normal name resolution rules, nonlocals() needs to slip in there between locals() and globals(). I realise that there actually isn't a nonlocals() function (perhaps there should be?).
If we could spell that as, say `lgb()`, that would let str.format() or %-formatting pick up the full "what's in scope". To my mind, that's the only good thing about the f-string idea.
I like the concept, but not the name. Initialisms tend to be hard to remember and rarely self-explanatory. How about scope()?
-- Steve
![](https://secure.gravatar.com/avatar/63ca18e130d527d0741f1da54bb129a7.jpg?s=120&d=mm&r=g)
On Mon, Aug 10, 2015 at 1:52 PM, David Mertz <mertz@gnosis.cx> wrote:
I know. I elided including the nonexistent `nonlocals()` in there. But it *should* be `lngb()`. Or call it scope(). :-) On Aug 10, 2015 10:09 AM, "Steven D'Aprano" <steve@pearwood.info> wrote:
On Sun, Aug 09, 2015 at 06:14:18PM -0700, David Mertz wrote:
[...]
That said, there *is* one small corner where I believe f-strings add something helpful to the language. There is no really concise way to spell:
collections.ChainMap(locals(), globals(), __builtins__.__dict__).
I think that to match the normal name resolution rules, nonlocals() needs to slip in there between locals() and globals(). I realise that there actually isn't a nonlocals() function (perhaps there should be?).
If we could spell that as, say `lgb()`, that would let str.format() or %-formatting pick up the full "what's in scope". To my mind, that's the only good thing about the f-string idea.
I like the concept, but not the name. Initialisms tend to be hard to remember and rarely self-explanatory. How about scope()?
#letsgoblues! scope(**kwargs), lngb(**kwargs), lookup(**kwargs) could allow for local attr override.
-- Steve
![](https://secure.gravatar.com/avatar/61a537f7b31ecf682e3269ea04056e94.jpg?s=120&d=mm&r=g)
Eric, On 2015-08-07 9:39 PM, Eric V. Smith wrote: [..]
'f-strings are very awesome!'
I'm open to any suggestions to improve the PEP. Thanks for your feedback.
Congrats for the PEP, it's a cool concept! Overall I'm +1, because a lot of my own formatting code looks like this:

'something ... {var1} .. something ... {var2}'.format(
    var1=var1, var2=var2)

However, I'm still -1 on a few things.

1. Naming. How about renaming f-strings to i-strings (short for interpolated, and, maybe, later for i18n-ed)? So instead of f'...' we will have i'...'. There is a parallel PEP 501 by Nick Coghlan proposing integrating translation mechanisms, and I think that the "i-" prefix would allow us to implement PEP 498 first, and later build upon it. And, to my ears, "i-string" sounds way better than "f-string".

2. I'm still not sold on allowing arbitrary expressions in strings. There is something about this idea that conflicts with Python philosophy and its principles. Supporting arbitrary expressions means that we give a blessing to shifting parts of application business logic to string formatting. I'd hate to see code like this:

print(f'blah blah {self.foobar(spam="ham")!r} blah')

To me it seems completely unreadable, and should be refactored to:

result = self.foobar(spam="ham")
print(f'blah blah {result!r} blah')

The refactored snippet of code is readable even without advanced syntax highlighting. Moreover, if we decide to implement Nick's PEP 501, then supporting expressions in f-strings will cause more harm than good, as translators usually aren't programmers.

I think that the main reason behind allowing arbitrary expressions in f-strings is allowing attribute and item access:

f'{foo.bar} {spam["ham"]}'

If that's the case, then can we just restrict expressions allowed in f-strings to names, attribute and item lookups? And if later there is a strong demand for full expressions, we can add them in 3.7?

Thanks, Yury
![](https://secure.gravatar.com/avatar/e6e28dcae5e3df0190e0760e96f7d8ab.jpg?s=120&d=mm&r=g)
Here are my notes on PEP 498.

1. Title: Literal String Formatting

   - String Literal Formatting
   - Format String Expressions ?

2. Let's call them "format strings" not "f-strings". The latter sounds slightly obnoxious, and also inconsistent with the others:

   r''  raw string
   u''  unicode object (string)
   f''  format string

3. "This PEP does not propose to remove or deprecate any of the existing string formatting mechanisms."

   Should we put this farther up with the section talking about them, it seems out of place where it is.

4. "The existing ways of formatting are either error prone, inflexible, or cumbersome."

   I would tone this down a bit, they're not so bad; "quite verbose" is a phrase I might use instead.

5. Discussion Section

   How to designate f-strings, and how specify the locaton of expressions
                                                   ^ typo

6. Perhaps mention string literal functionality, like triple quotes, line-ending backslashes, as MRAB mentions, in addition to the concatenation rules.

-Mike

On 08/07/2015 06:39 PM, Eric V. Smith wrote:
![](https://secure.gravatar.com/avatar/2828041405aa313004b6549acf918228.jpg?s=120&d=mm&r=g)
On 08/10/2015 04:12 PM, Mike Miller wrote:
Here are my notes on PEP 498.
1. Title: Literal String Formatting
- String Literal Formatting
- Format String Expressions ?
I like "String Literal Formatting", but let me sleep on it.
2. Let's call them "format strings" not "f-strings". The latter sounds slightly obnoxious, and also inconsistent with the others:
r''  raw string
u''  unicode object (string)
f''  format string
People seem to have already started using f-strings. I think it's inevitable.
3. " This PEP does not propose to remove or deprecate any of the existing string formatting mechanisms. "
Should we put this farther up with the section talking about them, it seems out of place where it is.
Done.
4. "The existing ways of formatting are either error prone, inflexible, or cumbersome."
I would tone this down a bit, they're not so bad, quite verbose is a phrase I might use instead.
I'll try and tone it down.
5. Discussion Section

How to designate f-strings, and how specify the locaton of expressions
                                                ^ typo
I already found that one. Thanks.
6. Perhaps mention string literal functionality, like triple quotes, line-ending backslashes, as MRAB mentions, in addition to the concatenation rules.
Good idea. Eric.
-Mike
On 08/07/2015 06:39 PM, Eric V. Smith wrote:
![](https://secure.gravatar.com/avatar/e6e28dcae5e3df0190e0760e96f7d8ab.jpg?s=120&d=mm&r=g)
On 08/11/2015 06:47 AM, Eric V. Smith wrote:
2. Let's call them "format strings" not "f-strings". The latter sounds slightly obnoxious, and also inconsistent with the others:
r''  raw string
u''  unicode object (string)
f''  format string
People seem to have already started using f-strings. I think it's inevitable.
Sure, there's no way to ban it, that would be silly. But, I think the documentation should not use it. We don't normally say "r-strings" or "u-strings" when talking about them, it's not very accurate. The letter they use isn't their important quality. Also, avoiding the f- takes the spotlight off the part where f stands for words besides format. ;) -Mike
![](https://secure.gravatar.com/avatar/d995b462a98fea412efa79d17ba3787a.jpg?s=120&d=mm&r=g)
On 8 August 2015 at 02:39, Eric V. Smith <eric@trueblade.com> wrote:
Following a long discussion on python-ideas, I've posted my draft of PEP-498. It describes the "f-string" approach that was the subject of the "Briefer string format" thread. I'm open to a better title than "Literal String Formatting".
I need to add some text to the discussion section, but I think it's in reasonable shape. I have a fully working implementation that I'll get around to posting somewhere this weekend.
def how_awesome(): return 'very' ... f'f-strings are {how_awesome()} awesome!' 'f-strings are very awesome!'
I'm open to any suggestions to improve the PEP. Thanks for your feedback.
In my view:

1. Calling them "format strings" rather than "f-strings" is sensible (by analogy with "raw string" etc). Colloquially we can use f-string if we want, but let's have the formal name be fully spelled out. In particular, the PEP should use "format string".

2. By far and away the most common use for me would be things like print(f"Iteration {n}: Took {end-start} seconds"). At the moment I use str.format() for this, and it's annoyingly verbose. This would be a big win, and I'm +1 on the PEP for this specific reason.

3. All of the complex examples look scary, but in practice I wouldn't write stuff like that - why would anyone do so unless they were being deliberately obscure? On the other hand, as I gained experience with the construct, being *able* to use more complex expressions without having to stop and remember special cases would be great.

4. It's easy to write print("My age is {age}") and forget the "f" prefix. While it'll bug me at first that I have to go back and fix stuff to add the "f" after my code gives the wrong output, I *don't* want to see this ability added to unprefixed strings. IMO that's going a step too far (explicit is better than implicit and all that).

5. The PEP is silent (as far as I can see) on things like whether triple quoting (f"""...""") is allowed (I assume it is), and whether prefixes can be combined (for example, rf'{drive}:\{path}\{filename}') (I'd like them to be, but can live without it).

6. The justification for ignoring whitespace is weak (the motivating case is obscure, and there are many viable workarounds). I don't think it's worth ignoring whitespace - but I also don't think it's worth a long discussion. Just pick an option (as you did) and go with it. So I see no need for change here.

Apologies for the above being terse - I'm clearing a big backlog of emails. Ask for clarification if you need it!

Paul
![](https://secure.gravatar.com/avatar/2828041405aa313004b6549acf918228.jpg?s=120&d=mm&r=g)
Thanks, Paul. Good feedback. Triple quoted and raw strings work like you'd expect, but you're right: the PEP should make this clear. I might drop the leading spaces, for a technical reason having to deal with passing the strings in to str.format. But I agree it's not a big deal one way or the other. I'll incorporate the rest of your feedback (and others) when I get back to a real computer. -- Eric. Top-posted from my phone.
On Aug 16, 2015, at 10:21 AM, Paul Moore <p.f.moore@gmail.com> wrote:
On 8 August 2015 at 02:39, Eric V. Smith <eric@trueblade.com> wrote: Following a long discussion on python-ideas, I've posted my draft of PEP-498. It describes the "f-string" approach that was the subject of the "Briefer string format" thread. I'm open to a better title than "Literal String Formatting".
I need to add some text to the discussion section, but I think it's in reasonable shape. I have a fully working implementation that I'll get around to posting somewhere this weekend.
def how_awesome(): return 'very' ... f'f-strings are {how_awesome()} awesome!' 'f-strings are very awesome!'
I'm open to any suggestions to improve the PEP. Thanks for your feedback.
In my view:
1. Calling them "format strings" rather than "f-strings" is sensible (by analogy with "raw string" etc). Colloquially we can use f-string if we want, but let's have the formal name be fully spelled out. In particular, the PEP should use "format string".
2. By far and away the most common use for me would be things like print(f"Iteration {n}: Took {end-start} seconds"). At the moment I use str.format() for this, and it's annoyingly verbose. This would be a big win, and I'm +1 on the PEP for this specific reason.
3. All of the complex examples look scary, but in practice I wouldn't write stuff like that - why would anyone do so unless they were being deliberately obscure? On the other hand, as I gained experience with the construct, being *able* to use more complex expressions without having to stop and remember special cases would be great.
4. It's easy to write print("My age is {age}") and forget the "f" prefix. While it'll bug me at first that I have to go back and fix stuff to add the "f" after my code gives the wrong output, I *don't* want to see this ability added to unprefixed strings. IMO that's going a step too far (explicit is better than implicit and all that).
5. The PEP is silent (as far as I can see) on things like whether triple quoting (f"""...""") is allowed (I assume it is), and whether prefixes can be combined (for example, rf'{drive}:\{path}\{filename}') (I'd like them to be, but can live without it).
6. The justification for ignoring whitespace is weak (the motivating case is obscure, and there are many viable workarounds). I don't think it's worth ignoring whitespace - but I also don't think it's worth a long discussion. Just pick an option (as you did) and go with it. So I see no need for change here.
Apologies for the above being terse - I'm clearing a big backlog of emails. Ask for clarification if you need it!
Paul
![](https://secure.gravatar.com/avatar/047f2332cde3730f1ed661eebb0c5686.jpg?s=120&d=mm&r=g)
On Sun, Aug 16, 2015 at 8:55 PM, Eric V. Smith <eric@trueblade.com> wrote:
Thanks, Paul. Good feedback.
Indeed, I smiled when I saw Paul's post.
Triple quoted and raw strings work like you'd expect, but you're right: the PEP should make this clear.
I might drop the leading spaces, for a technical reason having to deal with passing the strings in to str.format. But I agree it's not a big deal one way or the other.
Hm. I rather like allowing optional leading/trailing spaces. Given that we support arbitrary expressions, we have to support internal spaces; I think that some people would really like to use leading/trailing spaces, especially when there's text immediately against the other side of the braces, as in

f'Stuff{ len(self.busy) }more stuff'

I also expect it might be useful to allow leading/trailing newlines, if they are allowed at all (i.e. inside triple-quoted strings). E.g.

f'''Stuff{
len(self.busy)
}more stuff'''
I'll incorporate the rest of your feedback (and others) when I get back to a real computer.
Here's another thing for everybody's pondering: when tokenizing an f-string, I think the pieces could each become tokens in their own right. Then the rest of the parsing (and rules about whitespace etc.) would become simpler because the grammar would deal with them. E.g. the string above would be tokenized as follows: f'Stuff{ len ( self . busy ) }more stuff' The understanding here is that there are these new types of tokens: F_STRING_OPEN for f'...{, F_STRING_MIDDLE for }...{, F_STRING_END for }...', and I suppose we also need F_STRING_OPEN_CLOSE for f'...' (i.e. not containing any substitutions). These token types can then be used in the grammar. (A complication would be different kinds of string quotes; I propose to handle that in the lexer, otherwise the number of open/close token types would balloon out of proportions.) -- --Guido van Rossum (python.org/~guido)
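Purely as illustration, the token stream for the example might come out roughly as below; the F_STRING_* names are the proposals above, and the breakdown of the embedded expression into ordinary tokens is an assumption:

# f'Stuff{ len(self.busy) }more stuff' could lex to something like:
tokens = [
    ('F_STRING_OPEN', "f'Stuff{"),
    ('NAME',          'len'),
    ('OP',            '('),
    ('NAME',          'self'),
    ('OP',            '.'),
    ('NAME',          'busy'),
    ('OP',            ')'),
    ('F_STRING_END',  "}more stuff'"),
]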
![](https://secure.gravatar.com/avatar/2828041405aa313004b6549acf918228.jpg?s=120&d=mm&r=g)
On 08/16/2015 03:37 PM, Guido van Rossum wrote:
On Sun, Aug 16, 2015 at 8:55 PM, Eric V. Smith <eric@trueblade.com> wrote:
Thanks, Paul. Good feedback.
Indeed, I smiled when I saw Paul's post.
Triple quoted and raw strings work like you'd expect, but you're right: the PEP should make this clear.
I might drop the leading spaces, for a technical reason having to deal with passing the strings in to str.format. But I agree it's not a big deal one way or the other.
Hm. I rather like allowing optional leading/trailing spaces. Given that we support arbitrary expressions, we have to support internal spaces; I think that some people would really like to use leading/trailing spaces, especially when there's text immediately against the other side of the braces, as in
f'Stuff{ len(self.busy) }more stuff'
I also expect it might be useful to allow leading/trailing newlines, if they are allowed at all (i.e. inside triple-quoted strings). E.g.
f'''Stuff{
len(self.busy)
}more stuff'''
Okay, I'm sold. This works in my current implementation:
>>> f'''foo
... { 3 }
... bar'''
'foo\n3\nbar'
And since this currently works, there's no implementation specific reason to disallow leading and trailing whitespace:
>>> '\n{\n3 + \n 1\t\n}\n'.format_map({'\n3 + \n 1\t\n': 4})
'\n4\n'
My current plan is to replace an f-string with a call to .format_map:
>>> foo = 100
>>> bar = 20
>>> f'foo: {foo} bar: { bar+1}'
Would become:

'foo: {foo} bar: { bar+1}'.format_map({'foo': 100, ' bar+1': 21})

The string on which format_map is called is the identical string that's in the source code. With the exception noted in PEP 498, I think this satisfies the principle of least surprise.

As I've said elsewhere, we could then have some i18n function look up and replace the string before format_map is called on it. As long as it leaves the expression text alone, everything will work out fine. There are some quirks with having the same expression appear twice, if the expression has side effects. But I'm not so worried about that.
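The side-effect quirk can be made concrete; this sketch (bump() is invented) shows one way it could surface under the format_map scheme as described:

counter = [0]

def bump():
    counter[0] += 1
    return counter[0]

# If f'{ bump() } x { bump() }' compiles to a mapping with a single
# ' bump() ' key, the expression runs once and both fields see the
# same value:
print('{ bump() } x { bump() }'.format_map({' bump() ': bump()}))
# -> '1 x 1'; whether a compiler evaluates the expression once or
# twice here is exactly the sort of quirk in question.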
Here's another thing for everybody's pondering: when tokenizing an f-string, I think the pieces could each become tokens in their own right. Then the rest of the parsing (and rules about whitespace etc.) would become simpler because the grammar would deal with them. E.g. the string above would be tokenized as follows:
f'Stuff{ len ( self . busy ) }more stuff'
The understanding here is that there are these new types of tokens: F_STRING_OPEN for f'...{, F_STRING_MIDDLE for }...{, F_STRING_END for }...', and I suppose we also need F_STRING_OPEN_CLOSE for f'...' (i.e. not containing any substitutions). These token types can then be used in the grammar. (A complication would be different kinds of string quotes; I propose to handle that in the lexer, otherwise the number of open/close token types would balloon out of proportions.)
This would save a few hundred lines of C code. But a quick glance at the lexer and I can't see how to make the opening quotes agree with the closing quotes. I think the i18n case (if we chose to support it) is better served by having the entire, unaltered source string available at run time. PEP 501 comes to a similar conclusion (http://legacy.python.org/dev/peps/pep-0501/#preserving-the-unmodified-format...). Eric.
![](https://secure.gravatar.com/avatar/047f2332cde3730f1ed661eebb0c5686.jpg?s=120&d=mm&r=g)
On Mon, Aug 17, 2015 at 7:13 AM, Eric V. Smith <eric@trueblade.com> wrote:
[...] My current plan is to replace an f-string with a call to .format_map:
foo = 100 bar = 20 f'foo: {foo} bar: { bar+1}'
Would become: 'foo: {foo} bar: { bar+1}'.format_map({'foo': 100, ' bar+1': 21})
The string on which format_map is called is the identical string that's in the source code. With the exception noted in PEP 498, I think this satisfies the principle of least surprise.
Does this really work? Shouldn't this be using some internal variant of format_map() that doesn't attempt to interpret the keys in brackets in any ways? Otherwise there'd be problems with the different meaning of e.g. {a[x]} (unless I misunderstand .format_map() -- I'm assuming it's just like .format(**blah)).
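A quick REPL check (nothing beyond stock str.format_map() involved) confirms the conflict being pointed at here:

>>> '{a[x]}'.format_map({'a[x]': 42})
Traceback (most recent call last):
  ...
KeyError: 'a'
>>> '{a[x]}'.format_map({'a': {'x': 42}})  # [x] is parsed as indexing
'42'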
As I've said elsewhere, we could then have some i18n function look up and replace the string before format_map is called on it. As long as it leaves the expression text alone, everything will work out fine. There are some quirks with having the same expression appear twice, if the expression has side effects. But I'm not so worried about that.
The more I hear Barry's objections against arbitrary expressions from the i18n POV the more I am thinking that this is just a square peg and a round hole situation, and we should leave i18n alone. The requirements for i18n are just too different than the requirements for other use cases (i18n cares deeply about preserving the original text of the {...} interpolations; the opposite is the case for the other use cases).
[...]
The understanding here is that there are these new types of tokens: F_STRING_OPEN for f'...{, F_STRING_MIDDLE for }...{, F_STRING_END for }...', and I suppose we also need F_STRING_OPEN_CLOSE for f'...' (i.e. not containing any substitutions). These token types can then be used in the grammar. (A complication would be different kinds of string quotes; I propose to handle that in the lexer, otherwise the number of open/close token types would balloon out of proportions.)
This would save a few hundred lines of C code. But a quick glance at the lexer and I can't see how to make the opening quotes agree with the closing quotes.
The lexer would have to develop another stack for this purpose.
I think the i18n case (if we chose to support it) is better served by having the entire, unaltered source string available at run time. PEP 501 comes to a similar conclusion ( http://legacy.python.org/dev/peps/pep-0501/#preserving-the-unmodified-format... ).
Fair enough. -- --Guido van Rossum (python.org/~guido)
![](https://secure.gravatar.com/avatar/2828041405aa313004b6549acf918228.jpg?s=120&d=mm&r=g)
On 8/17/2015 2:24 PM, Guido van Rossum wrote:
On Mon, Aug 17, 2015 at 7:13 AM, Eric V. Smith <eric@trueblade.com> wrote:
[...] My current plan is to replace an f-string with a call to .format_map:

>>> foo = 100
>>> bar = 20
>>> f'foo: {foo} bar: { bar+1}'
Would become: 'foo: {foo} bar: { bar+1}'.format_map({'foo': 100, ' bar+1': 21})
The string on which format_map is called is the identical string that's in the source code. With the exception noted in PEP 498, I think this satisfies the principle of least surprise.
Does this really work? Shouldn't this be using some internal variant of format_map() that doesn't attempt to interpret the keys in brackets in any ways? Otherwise there'd be problems with the different meaning of e.g. {a[x]} (unless I misunderstand .format_map() -- I'm assuming it's just like .format(**blah).
Good point. It will require a similar function to format_map which doesn't interpret the contents of the braces (except to the extent that the f-string parser already has to). For argument's sake in point #4 below, let's call this str.format_map_simple.
As I've said elsewhere, we could then have some i18n function look up and replace the string before format_map is called on it. As long as it leaves the expression text alone, everything will work out fine. There are some quirks with having the same expression appear twice, if the expression has side effects. But I'm not so worried about that.
The more I hear Barry's objections against arbitrary expressions from the i18n POV the more I am thinking that this is just a square peg and a round hole situation, and we should leave i18n alone. The requirements for i18n are just too different than the requirements for other use cases (i18n cares deeply about preserving the original text of the {...} interpolations; the opposite is the case for the other use cases).
I think it would be possible to create a version of this that works for both i18n and regular interpolation. I think the open issues are:

1. Barry wants the substitutions to look like $identifier and possibly ${identifier}, and the PEP 498 proposal just uses {}.

2. There needs to be a way to identify interpolated strings and i18n strings, and possibly combinations of those. This leads to PEP 501's i- and iu- strings.

3. A way to enforce identifiers-only, instead of generalized expressions.

4. We need a "safe substitution" mode for str.format_map_simple (from above).

#1 is just a matter of preference: there's no technical reason to prefer {} over $ or ${}. We can make any decision here. I prefer {} because it's the same as str.format.

#2 needs to be decided in concert with the tooling needed to extract the strings from the source code. The particular prefixes are up for debate. I'm not a big fan of using "u" to have a meaning different from its current "do nothing" interpretation in 3.5. But really any prefixes will do, if we decide to use string prefixes. I think that's the question: do we want to distinguish among these cases using string prefixes or combinations thereof?

#3 is doable, either at runtime or in the tooling that does the string extraction.

#4 is simple, as long as we always turn it on for the localized strings.

Personally I can go either way on including i18n. But I agree it's beginning to sound like i18n is just too complicated for PEP 498, and I think PEP 501 is already too complicated. I'd like to make a decision on this one way or the other, so we can move forward.
[...]
> The understanding here is that there are these new types of tokens:
> F_STRING_OPEN for f'...{, F_STRING_MIDDLE for }...{, F_STRING_END for
> }...', and I suppose we also need F_STRING_OPEN_CLOSE for f'...' (i.e.
> not containing any substitutions). These token types can then be used in
> the grammar. (A complication would be different kinds of string quotes;
> I propose to handle that in the lexer, otherwise the number of
> open/close token types would balloon out of proportions.)
This would save a few hundred lines of C code. But a quick glance at the lexer and I can't see how to make the opening quotes agree with the closing quotes.
The lexer would have to develop another stack for this purpose.
I'll give it some thought. Eric.
![](https://secure.gravatar.com/avatar/047f2332cde3730f1ed661eebb0c5686.jpg?s=120&d=mm&r=g)
On Mon, Aug 17, 2015 at 1:26 PM, Eric V. Smith <eric@trueblade.com> wrote:
[...] I think it would be possible to create a version of this that works for both i18n and regular interpolation. I think the open issues are:
1. Barry wants the substitutions to look like $identifier and possibly ${identifier}, and the PEP 498 proposal just uses {}.
2. There needs to be a way to identify interpolated strings and i18n strings, and possibly combinations of those. This leads to PEP 501's i- and iu- strings.
3. A way to enforce identifiers-only, instead of generalized expressions.
In an off-list message to Barry and Nick I came up with the same three points. :-) I think #2 is the hard one (unless we adopt a solution like Yury just proposed where you can have an arbitrary identifier in front of a string literal).
4. We need a "safe substitution" mode for str.format_map_simple (from above).
#1 is just a matter of preference: there's no technical reason to prefer {} over $ or ${}. We can make any decision here. I prefer {} because it's the same as str.format.
#2 needs to be decided in concert with the tooling needed to extract the strings from the source code. The particular prefixes are up for debate. I'm not a big fan of using "u" to have a meaning different from it's current "do nothing" interpretation in 3.5. But really any prefixes will do, if we decide to use string prefixes. I think that's the question: do we want to distinguish among these cases using string prefixes or combinations thereof?
#3 is doable, either at runtime or in the tooling that does the string extraction.
#4 is simple, as long as we always turn it on for the localized strings.
Personally I can go either way on including i18n. But I agree it's beginning to sound like i18n is just too complicated for PEP 498, and I think PEP 501 is already too complicated. I'd like to make a decision on this one way or the other, so we can move forward.
What's the rush? There's plenty of time before Python 3.6.
[...]

> The understanding here is that there are these new types of tokens:
> F_STRING_OPEN for f'...{, F_STRING_MIDDLE for }...{, F_STRING_END for
> }...', and I suppose we also need F_STRING_OPEN_CLOSE for f'...' (i.e.
> not containing any substitutions). These token types can then be used in
> the grammar. (A complication would be different kinds of string quotes;
> I propose to handle that in the lexer, otherwise the number of
> open/close token types would balloon out of proportions.)

This would save a few hundred lines of C code. But from a quick glance at the lexer, I can't see how to make the opening quotes agree with the closing quotes.
The lexer would have to develop another stack for this purpose.
I'll give it some thought.
Eric.
-- --Guido van Rossum (python.org/~guido)
![](https://secure.gravatar.com/avatar/01aa7d6d4db83982a2f6dd363d0ee0f3.jpg?s=120&d=mm&r=g)
On Aug 17, 2015, at 01:36 PM, Guido van Rossum wrote:
1. Barry wants the substitutions to look like $identifier and possibly ${identifier}, and the PEP 498 proposal just uses {}.
2. There needs to be a way to identify interpolated strings and i18n strings, and possibly combinations of those. This leads to PEP 501's i- and iu- strings.
3. A way to enforce identifiers-only, instead of generalized expressions.
In an off-list message to Barry and Nick I came up with the same three points. :-)
I think #2 is the hard one (unless we adopt a solution like Yury just proposed where you can have an arbitrary identifier in front of a string literal).
I've been heads-down on other things for a little while, but trying to re-engage on this thread.

One thing that occurs to me now regarding points #1 and #3 is that, if we had a way to signal to the compiler how we wanted f-strings (to use a helpful shorthand) to be parsed, we could solve both problems and make the feature more useful for i18n. I'm thinking something along the lines of __future__ imports, which already influence how code in a module is construed. If we had a similar way to hint that f-strings should be construed in a way other than the default, I could do something like:

from __string__ import f_strings_as_i18n

at the top of my module, and that would slot in the parser for PEP 292 strings with no arbitrary expressions. I'd be fine with that.

There are some downsides, of course. I wouldn't be able to mix my simpler, i18n-based strings with the default full-featured PEP 498/501 strings in the same module. I can live with that. I don't see something like a context manager being appropriate for that use case because it's a run-time behavior, even if the syntax would look convenient. The hints inside the __string__ module wouldn't be extensible, except by modifying Python's stdlib. E.g. if you wanted $-strings but full expression support, we'd have to write and distribute that with the stdlib. I'm also fine with this because I think there aren't really *that* many different use cases.

(There's still #2 but let's deal with that later.)
4. We need a "safe substitution" mode for str.format_map_simple (from above).
Again, a `from __string__` import could solve that, right? Cheers, -Barry
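(For concreteness, the identifiers-only, $-based style Barry describes can be sketched with today's PEP 292 machinery in string.Template; the `from __string__ import f_strings_as_i18n` hook itself is hypothetical and doesn't exist.)

```python
# PEP 292 semantics: $name / ${name}, identifiers only, and
# safe_substitute() tolerates missing names -- roughly the behavior an
# i18n-oriented f-string mode would provide.
from string import Template

t = Template('Hello, $name! You have ${count} messages.')
print(t.substitute(name='Barry', count=3))
# -> Hello, Barry! You have 3 messages.
print(t.safe_substitute(name='Barry'))  # missing $count left as-is
# -> Hello, Barry! You have ${count} messages.
```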
![](https://secure.gravatar.com/avatar/daa45563a98419bb1b6b63904ce71f95.jpg?s=120&d=mm&r=g)
2015-08-16 7:21 GMT-07:00 Paul Moore <p.f.moore@gmail.com>:
2. By far and away the most common use for me would be things like print(f"Iteration {n}: Took {end-start) seconds"). At the moment I use str.format() for this, and it's annoyingly verbose. This would be a big win, and I'm +1 on the PEP for this specific reason.
You can use a temporary variable; it's not much longer:

print("Iteration {n}: Took {dt) seconds".format(n=n, dt=end-start))

becomes

dt = end - start
print(f"Iteration {n}: Took {dt) seconds")
3. All of the complex examples look scary, but in practice I wouldn't write stuff like that - why would anyone do so unless they were being deliberately obscure?
I'm quite sure that users will write complex code in f-strings. I vote -1 on the current PEP because of the support of Python code in f-strings, but +1 on a PEP without Python code. Victor
![](https://secure.gravatar.com/avatar/d995b462a98fea412efa79d17ba3787a.jpg?s=120&d=mm&r=g)
On 17 August 2015 at 05:34, Victor Stinner <victor.stinner@gmail.com> wrote:
2015-08-16 7:21 GMT-07:00 Paul Moore <p.f.moore@gmail.com>:
2. By far and away the most common use for me would be things like print(f"Iteration {n}: Took {end-start) seconds"). At the moment I use str.format() for this, and it's annoyingly verbose. This would be a big win, and I'm +1 on the PEP for this specific reason.
You can use a temporary variable; it's not much longer:

print("Iteration {n}: Took {dt) seconds".format(n=n, dt=end-start))

becomes

dt = end - start
print(f"Iteration {n}: Took {dt) seconds")
... which is significantly shorter (my point). And the inline expression

print(f"Iteration {n}: Took {end-start) seconds")

has (IMO) even better readability than the version with a temporary variable.
3. All of the complex examples look scary, but in practice I wouldn't write stuff like that - why would anyone do so unless they were being deliberately obscure?
I'm quite sure that users will write complex code in f-strings.
So am I. Some people will always write bad code. I won't (or at least, I'll try not to write code that *I* consider to be complex :-)) but "you can use this construct to write bad code" isn't an argument for dropping the feature. If you couldn't find *good* uses, that would be different, but that doesn't seem to be the case here (at least in my view). Paul.
![](https://secure.gravatar.com/avatar/53c166c5e1f0eef9ff4eb4d0b6ec9371.jpg?s=120&d=mm&r=g)
On 08/17/2015 03:02 AM, Paul Moore wrote:
On 17 August 2015 at 05:34, Victor Stinner <victor.stinner@gmail.com> wrote:
2015-08-16 7:21 GMT-07:00 Paul Moore <p.f.moore@gmail.com>:
3. All of the complex examples look scary, but in practice I wouldn't write stuff like that - why would anyone do so unless they were being deliberately obscure?

I'm quite sure that users will write complex code in f-strings.

So am I. Some people will always write bad code. I won't (or at least, I'll try not to write code that *I* consider to be complex :-)) but "you can use this construct to write bad code" isn't an argument for dropping the feature. If you couldn't find *good* uses, that would be different, but that doesn't seem to be the case here (at least in my view).
I think this corner of the debate is covered by the "Consenting adults" guiding principle we use 'round these parts. Cheers, //arry/
![](https://secure.gravatar.com/avatar/01aa7d6d4db83982a2f6dd363d0ee0f3.jpg?s=120&d=mm&r=g)
On Aug 17, 2015, at 11:02 AM, Paul Moore wrote:
print(f"Iteration {n}: Took {end-start) seconds")
This illustrates (more) problems I have with arbitrary expressions. First, you've actually made a typo there; it should be "{end-start}" -- notice the trailing curly brace. Second, what if you typoed that as "{end_start}"? According to PEP 498 the original typo above should trigger a SyntaxError and the second a run-time error (NameError?). But how will syntax highlighters and linters help you discover your bugs before you've even saved the file? Currently, a lot of these types of problems can be found much earlier on through the use of such linters. Putting arbitrary expressions in strings will just hide them to these tools for the foreseeable future. I have a hard time seeing how Emacs's syntax highlighting could cope with it for example. Cheers, -Barry
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Tue, Aug 18, 2015 at 12:50 AM, Barry Warsaw <barry@python.org> wrote:
On Aug 17, 2015, at 11:02 AM, Paul Moore wrote:
print(f"Iteration {n}: Took {end-start) seconds")
This illustrates (more) problems I have with arbitrary expressions.
First, you've actually made a typo there; it should be "{end-start}" -- notice the trailing curly brace. Second, what if you typoed that as "{end_start}"? According to PEP 498 the original typo above should trigger a SyntaxError and the second a run-time error (NameError?). But how will syntax highlighters and linters help you discover your bugs before you've even saved the file? Currently, a lot of these types of problems can be found much earlier on through the use of such linters. Putting arbitrary expressions in strings will just hide them to these tools for the foreseeable future. I have a hard time seeing how Emacs's syntax highlighting could cope with it for example.
The linters could tell you that you have no 'end' or 'start' just as easily when it's in that form as when it's written out in full. Certainly the mismatched brackets could easily be caught by any sort of syntax highlighter. The rules for f-strings are much simpler than, say, the PHP rules and the differences between ${...} and {$...}, which I've seen editors get wrong. ChrisA
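(A rough editorial sketch of the kind of check Chris describes: pull the replacement fields out and look for unknown names. string.Formatter.parse and ast are today's stdlib; the function itself is illustrative, not an existing linter API.)

```python
# Sketch: how a linter might flag unknown names used inside an
# f-string's expressions, given a set of names known to be in scope.
import ast
from string import Formatter

def unknown_names(template, known):
    for _literal, field, _spec, _conv in Formatter().parse(template):
        if field:
            expr = ast.parse(field, mode='eval')  # SyntaxError -> typo'd braces etc.
            for node in ast.walk(expr):
                if isinstance(node, ast.Name) and node.id not in known:
                    yield node.id

print(list(unknown_names('Iteration {n}: Took {end-start} seconds',
                         {'n', 'start'})))
# -> ['end']
```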
![](https://secure.gravatar.com/avatar/01aa7d6d4db83982a2f6dd363d0ee0f3.jpg?s=120&d=mm&r=g)
On Aug 18, 2015, at 12:58 AM, Chris Angelico wrote:
The linters could tell you that you have no 'end' or 'start' just as easily when it's in that form as when it's written out in full. Certainly the mismatched brackets could easily be caught by any sort of syntax highlighter. The rules for f-strings are much simpler than, say, the PHP rules and the differences between ${...} and {$...}, which I've seen editors get wrong.
I'm really asking whether it's technically feasible and realistically possible for them to do so. I'd love to hear from the maintainers of pyflakes, pylint, Emacs, vim, and other editors, linters, and other static analyzers on a rough technical assessment of whether they can support this and how much work it would be. Cheers, -Barry
![](https://secure.gravatar.com/avatar/047f2332cde3730f1ed661eebb0c5686.jpg?s=120&d=mm&r=g)
On Mon, Aug 17, 2015 at 8:13 AM, Barry Warsaw <barry@python.org> wrote:
I'm really asking whether it's technically feasible and realistically possible for them to do so. I'd love to hear from the maintainers of pyflakes, pylint, Emacs, vim, and other editors, linters, and other static analyzers on a rough technical assessment of whether they can support this and how much work it would be.
Those that aren't specific to Python will have to solve a similar problem for e.g. Swift, which supports \(...) in all strings with arbitrary expressions in the ..., or Perl, which apparently also supports arbitrary expressions. Heck, even Bash supports something like this, "...$(command)...". I am not disinclined to add some restrictions to make things a little more tractable, but they would be along the lines of the Swift restriction (the interpolated expression cannot contain string quotes). However, I do think we should support f"...{a['key']}...". -- --Guido van Rossum (python.org/~guido)
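(A minimal sketch of the subscription case, assuming f-strings as PEP 498 proposes them: the inner single quotes differ from the outer double quotes, so even under the Swift-style restriction the end of the literal stays findable.)

```python
# The interpolated expression contains string quotes, but not the
# *same* quotes as the enclosing literal, so the lexer can still find
# the end of the f-string.
a = {'key': 'value'}
print(f"before {a['key']} after")  # -> before value after
```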
![](https://secure.gravatar.com/avatar/be200d614c47b5a4dbb6be867080e835.jpg?s=120&d=mm&r=g)
On 17Aug2015 0813, Barry Warsaw wrote:
On Aug 18, 2015, at 12:58 AM, Chris Angelico wrote:
The linters could tell you that you have no 'end' or 'start' just as easily when it's in that form as when it's written out in full. Certainly the mismatched brackets could easily be caught by any sort of syntax highlighter. The rules for f-strings are much simpler than, say, the PHP rules and the differences between ${...} and {$...}, which I've seen editors get wrong.
I'm really asking whether it's technically feasible and realistically possible for them to do so. I'd love to hear from the maintainers of pyflakes, pylint, Emacs, vim, and other editors, linters, and other static analyzers on a rough technical assessment of whether they can support this and how much work it would be.
With the current format string proposals (allowing arbitrary expressions) I think I'd implement it in our parser with a FORMAT_STRING_TOKEN, a FORMAT_STRING_JOIN_OPERATOR and a FORMAT_STRING_FORMAT_OPERATOR.

A FORMAT_STRING_TOKEN would be started by f('|"|'''|""") and ended by matching quotes or before an open brace (that is not escaped).

A FORMAT_STRING_JOIN_OPERATOR would be inserted as the '{', which we'd either colour as part of the string or the regular brace colour. This also enables a parsing context where a colon becomes the FORMAT_STRING_FORMAT_OPERATOR and the right-hand side of this binary operator would be FORMAT_STRING_TOKEN. The final close brace becomes another FORMAT_STRING_JOIN_OPERATOR and the rest of the string is FORMAT_STRING_TOKEN.

So it'd translate something like this:

f"This {text} is my {string:>{length+3}}"

FORMAT_STRING_TOKEN[f"This ]
FORMAT_STRING_JOIN_OPERATOR[{]
IDENTIFIER[text]
FORMAT_STRING_JOIN_OPERATOR[}]
FORMAT_STRING_TOKEN[ is my ]
FORMAT_STRING_JOIN_OPERATOR[{]
IDENTIFIER[string]
FORMAT_STRING_FORMAT_OPERATOR[:]
FORMAT_STRING_TOKEN[>]
FORMAT_STRING_JOIN_OPERATOR[{]
IDENTIFIER[length]
OPERATOR[+]
NUMBER[3]
FORMAT_STRING_JOIN_OPERATOR[}]
FORMAT_STRING_TOKEN[]
FORMAT_STRING_JOIN_OPERATOR[}]
FORMAT_STRING_TOKEN["]

I *believe* (without having tried it) that this would let us produce a valid tokenisation (in our model) without too much difficulty, and highlight/analyse correctly, including validating matching braces. Getting the precedence correct on the operators might be more difficult, but we may also just produce an AST that looks like a function call, since that will give us "good enough" handling once we're past tokenisation.

A simpler tokenisation that would probably be sufficient for many editors would be to treat the first and last segments ([f"This {] and [}"]) as groupings and each section of text as separators, giving this:

OPEN_GROUPING[f"This {]
EXPRESSION[text]
COMMA[} is my {]
EXPRESSION[string]
COMMA[:>{]
EXPRESSION[length+3]
COMMA[}}]
CLOSE_GROUPING["]

Initial parsing may be a little harder, but it should mean less trouble when expressions spread across multiple lines, since that is already handled for other types of groupings. And if any code analysis is occurring, it should be happening for dict/list/etc. contents already, and so format strings will get it too.

So I'm confident we can support it, and I expect either of these two approaches will work for most tools without too much trouble. (There's also a middle ground where you create new tokens for format string components, but combine them like the second example.)

Cheers, Steve
Cheers, -Barry
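(An editor-oriented sketch of the first scheme Steve describes: a scanner that splits an f-string body into literal and expression tokens. It assumes the quotes are already stripped, ignores {{ }} escapes and ':' format specs, and the token names merely echo Steve's; none of this is a real tokenizer API.)

```python
# Minimal FORMAT_STRING_TOKEN / FORMAT_STRING_JOIN_OPERATOR scanning,
# tracking brace depth so nested braces in a format spec stay inside
# the enclosing expression.
def scan_fstring_body(body):
    tokens, depth, start = [], 0, 0
    for i, ch in enumerate(body):
        if ch == '{' and depth == 0:
            tokens.append(('FORMAT_STRING_TOKEN', body[start:i]))
            tokens.append(('FORMAT_STRING_JOIN_OPERATOR', '{'))
            depth, start = 1, i + 1
        elif ch == '{':
            depth += 1
        elif ch == '}' and depth == 1:
            tokens.append(('EXPRESSION', body[start:i]))
            tokens.append(('FORMAT_STRING_JOIN_OPERATOR', '}'))
            depth, start = 0, i + 1
        elif ch == '}':
            depth -= 1
    tokens.append(('FORMAT_STRING_TOKEN', body[start:]))
    return tokens

print(scan_fstring_body('This {text} is my {string:>{length+3}}'))
```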
![](https://secure.gravatar.com/avatar/be200d614c47b5a4dbb6be867080e835.jpg?s=120&d=mm&r=g)
On 17Aug2015 1506, Steve Dower wrote:
On 17Aug2015 0813, Barry Warsaw wrote:
On Aug 18, 2015, at 12:58 AM, Chris Angelico wrote:
The linters could tell you that you have no 'end' or 'start' just as easily when it's in that form as when it's written out in full. Certainly the mismatched brackets could easily be caught by any sort of syntax highlighter. The rules for f-strings are much simpler than, say, the PHP rules and the differences between ${...} and {$...}, which I've seen editors get wrong.
I'm really asking whether it's technically feasible and realistically possible for them to do so. I'd love to hear from the maintainers of pyflakes, pylint, Emacs, vim, and other editors, linters, and other static analyzers on a rough technical assessment of whether they can support this and how much work it would be.
With the current format string proposals (allowing arbitrary expressions) I think I'd implement it in our parser with a FORMAT_STRING_TOKEN, a FORMAT_STRING_JOIN_OPERATOR and a FORMAT_STRING_FORMAT_OPERATOR.
A FORMAT_STRING_TOKEN would be started by f('|"|'''|""") and ended by matching quotes or before an open brace (that is not escaped).
A FORMAT_STRING_JOIN_OPERATOR would be inserted as the '{', which we'd either colour as part of the string or the regular brace colour. This also enables a parsing context where a colon becomes the FORMAT_STRING_FORMAT_OPERATOR and the right-hand side of this binary operator would be FORMAT_STRING_TOKEN. The final close brace becomes another FORMAT_STRING_JOIN_OPERATOR and the rest of the string is FORMAT_STRING_TOKEN.
So it'd translate something like this:
f"This {text} is my {string:>{length+3}}"
FORMAT_STRING_TOKEN[f"This ] FORMAT_STRING_JOIN_OPERATOR[{] IDENTIFIER[text] FORMAT_STRING_JOIN_OPERATOR[}] FORMAT_STRING_TOKEN[ is my ] FORMAT_STRING_JOIN_OPERATOR[{] IDENTIFIER[string] FORMAT_STRING_FORMAT_OPERATOR[:] FORMAT_STRING_TOKEN[>] FORMAT_STRING_JOIN_OPERATOR[{] IDENTIFIER[length] OPERATOR[+] NUMBER[3] FORMAT_STRING_JOIN_OPERATOR[}] FORMAT_STRING_TOKEN[] FORMAT_STRING_JOIN_OPERATOR[}] FORMAT_STRING_TOKEN["]
I *believe* (without having tried it) that this would let us produce a valid tokenisation (in our model) without too much difficulty, and highlight/analyse correctly, including validating matching braces. Getting the precedence correct on the operators might be more difficult, but we may also just produce an AST that looks like a function call, since that will give us "good enough" handling once we're past tokenisation.
A simpler tokenisation that would probably be sufficient for many editors would be to treat the first and last segments ([f"This {] and [}"]) as groupings and each section of text as separators, giving this:
OPEN_GROUPING[f"This {] EXPRESSION[text] COMMA[} is my {] EXPRESSION[string] COMMA[:>{] EXPRESSION[length+3] COMMA[}}] CLOSE_GROUPING["]
Initial parsing may be a little harder, but it should mean less trouble when expressions spread across multiple lines, since that is already handled for other types of groupings. And if any code analysis is occurring, it should be happening for dict/list/etc. contents already, and so format strings will get it too.
So I'm confident we can support it, and I expect either of these two approaches will work for most tools without too much trouble. (There's also a middle ground where you create new tokens for format string components, but combine them like the second example.)
The middle ground would probably be required for correct highlighting. I implied but didn't specify that the tokens in my second example would get special treatment here.
Cheers, Steve
Cheers, -Barry
![](https://secure.gravatar.com/avatar/5ce43469c0402a7db8d0cf86fa49da5a.jpg?s=120&d=mm&r=g)
On 2015-08-17 23:06, Steve Dower wrote:
On 17Aug2015 0813, Barry Warsaw wrote:
On Aug 18, 2015, at 12:58 AM, Chris Angelico wrote:
The linters could tell you that you have no 'end' or 'start' just as easily when it's in that form as when it's written out in full. Certainly the mismatched brackets could easily be caught by any sort of syntax highlighter. The rules for f-strings are much simpler than, say, the PHP rules and the differences between ${...} and {$...}, which I've seen editors get wrong.
I'm really asking whether it's technically feasible and realistically possible for them to do so. I'd love to hear from the maintainers of pyflakes, pylint, Emacs, vim, and other editors, linters, and other static analyzers on a rough technical assessment of whether they can support this and how much work it would be.
With the current format string proposals (allowing arbitrary expressions) I think I'd implement it in our parser with a FORMAT_STRING_TOKEN, a FORMAT_STRING_JOIN_OPERATOR and a FORMAT_STRING_FORMAT_OPERATOR.
A FORMAT_STRING_TOKEN would be started by f('|"|'''|""") and ended by matching quotes or before an open brace (that is not escaped).
A FORMAT_STRING_JOIN_OPERATOR would be inserted as the '{', which we'd either colour as part of the string or the regular brace colour. This also enables a parsing context where a colon becomes the FORMAT_STRING_FORMAT_OPERATOR and the right-hand side of this binary operator would be FORMAT_STRING_TOKEN. The final close brace becomes another FORMAT_STRING_JOIN_OPERATOR and the rest of the string is FORMAT_STRING_TOKEN.
So it'd translate something like this:
f"This {text} is my {string:>{length+3}}"
FORMAT_STRING_TOKEN[f"This ] FORMAT_STRING_JOIN_OPERATOR[{] IDENTIFIER[text] FORMAT_STRING_JOIN_OPERATOR[}] FORMAT_STRING_TOKEN[ is my ] FORMAT_STRING_JOIN_OPERATOR[{] IDENTIFIER[string] FORMAT_STRING_FORMAT_OPERATOR[:] FORMAT_STRING_TOKEN[>] FORMAT_STRING_JOIN_OPERATOR[{] IDENTIFIER[length] OPERATOR[+] NUMBER[3] FORMAT_STRING_JOIN_OPERATOR[}] FORMAT_STRING_TOKEN[] FORMAT_STRING_JOIN_OPERATOR[}] FORMAT_STRING_TOKEN["]
I'm not sure about that. I think it might work better with, say, FORMAT_OPEN for '{' and FORMAT_CLOSE for '}':

FORMAT_STRING_TOKEN[f"This ]
FORMAT_OPEN
IDENTIFIER[text]
FORMAT_CLOSE
FORMAT_STRING_TOKEN[ is my ]
FORMAT_OPEN
IDENTIFIER[string]
FORMAT_STRING_FORMAT_OPERATOR[:]
FORMAT_STRING_TOKEN[>]
FORMAT_OPEN
IDENTIFIER[length]
OPERATOR[+]
NUMBER[3]
FORMAT_CLOSE
FORMAT_CLOSE
FORMAT_STRING_TOKEN["]
I *believe* (without having tried it) that this would let us produce a valid tokenisation (in our model) without too much difficulty, and highlight/analyse correctly, including validating matching braces. Getting the precedence correct on the operators might be more difficult, but we may also just produce an AST that looks like a function call, since that will give us "good enough" handling once we're past tokenisation.
A simpler tokenisation that would probably be sufficient for many editors would be to treat the first and last segments ([f"This {] and [}"]) as groupings and each section of text as separators, giving this:
OPEN_GROUPING[f"This {] EXPRESSION[text] COMMA[} is my {] EXPRESSION[string] COMMA[:>{] EXPRESSION[length+3] COMMA[}}] CLOSE_GROUPING["]
Initial parsing may be a little harder, but it should mean less trouble when expressions spread across multiple lines, since that is already handled for other types of groupings. And if any code analysis is occurring, it should be happening for dict/list/etc. contents already, and so format strings will get it too.
So I'm confident we can support it, and I expect either of these two approaches will work for most tools without too much trouble. (There's also a middle ground where you create new tokens for format string components, but combine them like the second example.)
Cheers, Steve
Cheers, -Barry
![](https://secure.gravatar.com/avatar/334b870d5b26878a79b2dc4cfcc500bc.jpg?s=120&d=mm&r=g)
Barry Warsaw writes:
On Aug 17, 2015, at 11:02 AM, Paul Moore wrote:
print(f"Iteration {n}: Took {end-start) seconds")
This illustrates (more) problems I have with arbitrary expressions.
First, you've actually made a typo there; it should be "{end-start}" -- notice the trailing curly brace. Second, what if you typoed that as "{end_start}"? According to PEP 498 the original typo above should trigger a SyntaxError
That ship has sailed; you have the same problem with str.format format strings already.
and the second a run-time error (NameError?).
Ditto.
But how will syntax highlighters and linters help you discover your bugs before you've even saved the file?
They need to recognize that a string prefixed with "f" is special, that it's not just a single token, and then parse the syntax. The hardest part is finding the end-of-string delimiter! The expression itself is not a problem, since either we already have the code to handle the expression, or we don't (and your whole point is moot). Emacs abandoned the idea that you should do syntax highlighting without parsing well over a decade ago. If Python can implement the syntax, Emacs can highlight it. It's just a question of whether there's the will to do it on the part of the python-mode maintainers. I'm sure the same can be said about other linters and highlighters for Python, though I have no part in implementing them.
![](https://secure.gravatar.com/avatar/7256049d410aa28c240527e1a779799e.jpg?s=120&d=mm&r=g)
On Aug 16 2015, Paul Moore <p.f.moore@gmail.com> wrote:
2. By far and away the most common use for me would be things like print(f"Iteration {n}: Took {end-start) seconds").
I believe an even more common use will be

print(f"Iteration {n+1}: Took {end-start} seconds")

Note that not allowing expressions would turn this into the rather verbose:

iteration = n+1
duration = end-start
print(f"Iteration {iteration}: Took {duration} seconds")

Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.«
![](https://secure.gravatar.com/avatar/047f2332cde3730f1ed661eebb0c5686.jpg?s=120&d=mm&r=g)
On Mon, Aug 17, 2015 at 12:23 PM, Nikolaus Rath <Nikolaus@rath.org> wrote:
On Aug 16 2015, Paul Moore <p.f.moore@gmail.com> wrote:
2. By far and away the most common use for me would be things like print(f"Iteration {n}: Took {end-start) seconds").
I believe an even more common use will be
print(f"Iteration {n+1}: Took {end-start} seconds")
Note that not allowing expressions would turn this into the rather verbose:
iteration = n+1
duration = end-start
print(f"Iteration {iteration}: Took {duration} seconds")
Let's stop debating this point -- any acceptable solution will have to support (more-or-less) arbitrary expressions. *If* we end up also attempting to solve i18n, then it will be up to the i18n toolchain to require a stricter syntax. (I imagine this could be done during the string extraction phase.) -- --Guido van Rossum (python.org/~guido)
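(The stricter-syntax check during extraction is easy to sketch: the i18n tool can pull out each replacement field and reject anything that isn't a bare identifier. string.Formatter.parse and str.isidentifier are today's stdlib; the function itself is illustrative, not part of any proposed API.)

```python
# Sketch: during i18n string extraction, reject any replacement field
# that is not a plain identifier, per the stricter i18n syntax.
from string import Formatter

def identifiers_only(template):
    for _literal, field, _spec, _conv in Formatter().parse(template):
        if field is not None and not field.isidentifier():
            raise ValueError('not an identifier: %r' % field)
    return True

print(identifiers_only('Iteration {n}: Took {dt} seconds'))  # True
# identifiers_only('Took {end-start} seconds') -> ValueError
```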
![](https://secure.gravatar.com/avatar/63ca18e130d527d0741f1da54bb129a7.jpg?s=120&d=mm&r=g)
On Aug 17, 2015 2:23 PM, "Nikolaus Rath" <Nikolaus@rath.org> wrote:
On Aug 16 2015, Paul Moore <p.f.moore@gmail.com> wrote:
2. By far and away the most common use for me would be things like print(f"Iteration {n}: Took {end-start) seconds").
I believe an even more common use will be
print(f"Iteration {n+1}: Took {end-start} seconds")
Note that not allowing expressions would turn this into the rather verbose:
iteration = n+1
duration = end-start
print(f"Iteration {iteration}: Took {duration} seconds")
* Is this more testable?
* mutability of e.g. end.__sub__
* (do I add this syntax highlighting for Python < 3.6?)
Best, -Nikolaus
-- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
»Time flies like an arrow, fruit flies like a Banana.«
participants (30)

- Alexander Walters
- Barry Warsaw
- Brett Cannon
- Carl Meyer
- Chris Angelico
- David Mertz
- Eric V. Smith
- Ethan Furman
- Greg Ewing
- Gregory P. Smith
- Guido van Rossum
- ISAAC J SCHWABACHER
- Larry Hastings
- Mike Miller
- MRAB
- Nathaniel Smith
- Nick Coghlan
- Nikolaus Rath
- Paul Moore
- Peter Ludemann
- Raymond Hettinger
- Stefan Behnel
- Stephen J. Turnbull
- Steve Dower
- Steven D'Aprano
- Sven R. Kunze
- Tim Delaney
- Victor Stinner
- Wes Turner
- Yury Selivanov