Subtle difference between f-strings and str.format()

There is a subtle semantic difference between str.format() and "equivalent" f-string. '{}{}'.format(a, b) f'{a}{b}' In the former case b is evaluated before formatting a. This is equivalent to t1 = a t2 = b t3 = format(t1) t4 = format(t2) r = t3 + t4 In the latter case a is formatted before evaluating b. This is equivalent to t1 = a t2 = format(t1) t3 = b t4 = format(t3) r = t2 + t4 In most cases this doesn't matter, but when implement the optimization that transforms the former expression to the the latter one ([1], [2]) we have to make a decision what to do with this difference. 1. Keep the exact semantic of str.format() when optimize it. This means that it should be transformed into AST node different from the AST node used for f-strings. Either introduce a new AST node type, or add a boolean flag to JoinedStr. 2. Change the semantic of f-strings. Make it closer to the semantic of str.format(): evaluate all subexpressions first than format them. This can be implemented in two ways: 2a) Add additional instructions for stack manipulations. This will slow down f-strings. 2b) Introduce a new complex opcode that will replace FORMAT_VALUE and BUILD_STRING. This will speed up f-strings. 3. Transform str.format() into an f-string with changing semantic, and ignore this change. This is not new. The optimizer already changes semantic. Non-optimized "if a and True:" would call bool(a) twice, but optimized code calls it only once. [1] https://bugs.python.org/issue28307 [2] https://bugs.python.org/issue28308

Hm, without thinking too much about it I'd say it's okay to change the evaluation order. Can these optimizations be disabled with something like -O0? On Wed, Mar 28, 2018 at 8:27 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

28.03.18 19:20, Guido van Rossum пише:
Hm, without thinking too much about it I'd say it's okay to change the evaluation order.
Do you mean the option 3, right? This is the simplest option. I have already wrote a PR for optimizing old-style formating [1], but have not merged it yet due to this change of semantic.
Can these optimizations be disabled with something like -O0?
Currently there is no way to disable optimizations. There is an open issue with a request for this. [2] [1] https://github.com/python/cpython/pull/5012 [2] https://bugs.python.org/issue2506

28.03.18 22:05, Paul Moore пише
If formatting a and evaluating b both raise exceptions, the resulting exception depends on the order. $ ./python -bb >>> a = b'bytes' >>> '{}{}'.format(a, b) Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'b' is not defined >>> f'{a}{b}' Traceback (most recent call last): File "<stdin>", line 1, in <module> BytesWarning: str() on a bytes instance

[Serhiy Storchaka <storchaka@gmail.com>]
I have a hard time imaging how that could have come to be, but if it's true I'd say the unoptimized code was plain wrong. The dumbest possible way to implement `f() and g()` is also the correct ;-) way: result = f() if not bool(result): result = g() For the thing you really care about here, the language guarantees `a` will be evaluated before `b` in: '{}{}'.format(a, b) but I'm not sure it says anything about how the format operations are interleaved. So your proposed transformation is fine by me (your #3: still evaluate `a` before `b` but ignore that the format operations may occur in a different order with respect to those).

[Tim]
Ah, I think I see your point now. In the _context_ of `if f() and g()`, the dumbest possible code generation would do the above, and then go on to do if bool(result): .... If in fact `f()` returned a false-like value, an optimizer could note that `bool(result)` had already been evaluated and skip the redundant evaluation. I think that's fine either way: what the language guarantees is that `f()` will be evaluated exactly once, and `g()` no more than once, and that's all so regardless.

28.03.18 21:30, Tim Peters пише:
Optimized if f() and g(): spam() is equivalent to result = f() if bool(result): result = g() if bool(result): spam() Without optimization it would be equivalent to result = f() if bool(result): result = g() if bool(result): spam() It calls bool() for the result of f() twice if it is false. Thus there is a small difference between if f() and g(): spam() and tmp = f() and g() if tmp: spam()

Serhiy Storchaka schrieb am 28.03.2018 um 17:27:
I agree that it's not normally a problem, but if the formatting of 'a' fails and raises an exception, then 'b' will not get evaluated at all in the second case. Whether this difference is subtle or not is seems to depend largely on the code at hand. Stefan

On 29 March 2018 at 08:09, Tim Delaney <timothy.c.delaney@gmail.com> wrote:
Sorry about that - finger slipped and I sent an incomplete email ... If I'm not mistaken, #3 would result in the optimiser changing str.format() into an f-string in-place. Is this correct? We're not talking here about people manually changing the code from str.format() to f-strings, right? I would argue that any optimisation needs to have the same semantics as the original code - in this case, that all arguments are evaluated before the string is formatted. I also assumed (not having actually used an f-string) that all its formatting arguments were evaluated before formatting. So my preference would be (if my understanding in the first line is correct): 1: +0 2a: +0.5 2b: +1 3: -1 Tim Delaney

[Tim Delaney <timothy.c.delaney@gmail.com>]
All correct. It's a magical transformation from one spelling to another.
That's why Serhiy is asking about it - there _are_ potentially visible changes in behavior under all but one of his suggestions.
I also assumed (not having actually used an f-string) that all its formatting arguments were evaluated before formatting.
It's a string - it doesn't have "arguments" as such. For example: def f(a, b, n): return f"{a+b:0{n}b}" # the leading "f" makes it an f-string Then
f(2, 3, 12) '000000000101'
The generated code currently interleaves evaluating expressions with formatting the results in a more-than-less obvious way, waiting until the end to paste all the formatted fragments together. As shown in the example, this can be more than one level deep (the example needs to paste together "0", str(n), and "b" to _build_ the format code for `a+b`).
That's the only suggestion with no potentially visible changes. I'll add another: leave `.format()` alone entirely - there's no _need_ to "optimize" it, it's just a maybe-nice-to-have.
2a: +0.5 2b: +1
Those two don't change the behaviors of `.format()`, but _do_ change some end-case behaviors of f-strings. If you're overly ;-) concerned about the former, it would be consistent to be overly concerned about the latter too.
3: -1
And that's the one that proposes to let .format() also interleave expression evaluation (but still strictly "left to right") with formatting. If it were a general code transformation, I'm sure everyone would be -1. As is, it's hard to care. String formatting is a tiny area, and format methods are generally purely functional (no side effects). If anyone has a non-contrived example where the change would make a lick of real difference, they shouldn't be shy about posting it :-) I looked, and can't find any in my code.

My credentials for this are that I re-worked str.format in Jython quite extensively, and I followed the design of f-strings a bit when they were introduced, but I haven't used them to write anything. On 29/03/2018 00:48, Tim Peters wrote:
Agreed "argument" is the wrong word, but so is "string". It's an expression returning a string, in which a, b and n are free variables. I think we can understand it best as a string-display (https://docs.python.org/3/reference/expressions.html#list-displays), or a sort of eval() call. The difference Serhiy identifies emerges (I think) because in the conventional interpretation of a format call, the arguments of format are evaluated left-to right (all of them) and then formatted in the order references are encountered to these values in a tuple or dictionary. In an f-string expressions are evaluated as they are encountered. A more testing example is therefore perhaps: '{1} {0}'.format(a(), b()) # E1 f'{b()}{a()}' # E2 I think I would be very surprised to find b called before a in E1 because of the general contract on the meaning of method calls. I'm assuming that's what an AST-based optimisation would do? There's no reason in E2 to call them in any other order than b then a and the documentation tells me they are. But do I expect a() to be called before the results of b() are formatted? In E1 I definitely expect that. In E2 I don't think I'd be surprised either way. Forced to guess, I would guess that b() would be formatted and in the output buffer before a() was called, since it gives the implementation fewer things to remember. Then I hope I would not depend on this guesswork. Strictly-speaking the documentation doesn't say when the result is formatted in relation to the evaluation of other expressions, so there is permission for Serhiy's idea #2. I think the (internal) AST change implied in Serhiy's idea #1 is the price one has to pay *if* one insists on optimising str.format(). str.format just a method like any other. The reasons would have to be very strong to give it special-case semantics. I agree that the cases are rare in which one would notice a difference. (Mostly I think it would be a surprise during debugging.) But I think users should be able to rely on the semantics of call. Easier optimisation doesn't seem to me a strong enough argument. This leaves me at: 1: +1 2a, 2b: +0 3: -1 Jeff Allen

On 3/29/2018 6:17 AM, Jeff Allen wrote:
Thanks for your work on Jython. And hop on the f-string bandwagon!
I don't think we should restrict f-strings to having to evaluate all of the expressions before formatting. But, if we do restrict it, we should document whatever the order is in 3.6 and add tests to ensure the behavior doesn't change.
#1 seems so complex as to not be worth it, given the likely small overall impact of the optimization to a large program. If the speedup really is sufficiently important for a particular piece of code, I'd suggest just rewriting the code to use f-strings, and the author could then determine if the transformation breaks anything. Maybe write a 2to3 like tool that would identify places where str.format or %-formatting could be replaced by f-strings? I know I'd run it on my code, if it existed. Because the optimization can only work code with literals, I think manually modifying the source code is an acceptable solution if the possible change in semantics implied by #3 are unacceptable. Eric.

On 29 March 2018 at 21:50, Eric V. Smith <eric@trueblade.com> wrote:
While more projects are starting to actively drop Python 2.x support, there are also quite a few still straddling the two different versions. The "rewrite to f-strings" approach requires explicitly dropping support for everything below 3.6, whereas implicit optimization of literal based formatting will work even for folks preserving backwards compatibility with older versions. As far as the semantics go, perhaps it would be possible to explicitly create a tuple as part of the implementation to ensure that the arguments are still evaluated in order, and everything gets calculated exactly once? This would have the benefit that even format strings that used numbered references could be optimised in a fairly straightforward way. '{}{}'.format(a, b) would become: _hidden_ref = (a, b) f'{_hidden_ref[0]}{_hidden_ref[1]}' while: '{1}{0}'.format(a, b) would become: _hidden_ref = (a, b) f'{_hidden_ref[1]}{_hidden_ref[0]}' This would probably need to be implemented as Serhiy's option 1 (generating a distinct AST node), which in turn leads to 2a: adding extra stack manipulation opcodes in order to more closely replicate str.format semantics. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 3/29/2018 12:13 PM, Nick Coghlan wrote:
Sure. But 3.6 will be 3 years old before this optimization is released. I've been seeing 3.4 support dropping off, and expect to see 3.5 follow suit by the time 3.8 is released. Although maybe the thought is to do this in a bug-fix release? If we're changing semantics at all, that seems like a non-starter.
I still think the complexity isn't worth it, but maybe I'm a lone voice on this. Eric.

On 30 March 2018 at 03:33, Eric V. Smith <eric@trueblade.com> wrote:
Definitely 3.8+ only. The nice thing about doing this implicitly at the compiler level is that it potentially provides an automatic performance improvement for existing libraries and applications. The justification for the extra complexity would then come from whether or not it actually measurably improve things, either for the benchmark suite, or for folks' real-world applications. Steven D'Aprano also raises a good point on that front: a FORMAT_STRING super-opcode that sped up f-strings *and* allowed semantics preserving constant-folding of str.format calls on string literals could make more sense than a change that focused solely on the implicit optimisation case. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

29.03.18 13:17, Jeff Allen пише:
I was going to optimize only formatting with implicit references. '{} {}' but not '{1} {0}' and either not '{0} {1}'. This guaranties in-order computation and referencing every subexpression only once. I don't have a goal of converting every string formatting, but only the most common and the most simple ones. If go further, we will need to add several new AST nodes (like for comprehensions).

On Wed, Mar 28, 2018 at 06:27:19PM +0300, Serhiy Storchaka wrote:
I don't understand this. Why would bool(a) be called twice, and when did this change? Surely calling it twice would be a bug. I just tried the oldest Python 3 I have on this computer, 3.2, and bool is only called once. -- Steve

On Fri, Mar 30, 2018 at 1:08 AM, Chris Angelico <rosuav@gmail.com> wrote:
Wow, I'm good. Premature send much? Nice going, Chris. Let's try that again. Here's some similar code:
This is, however, boolifying a, then boolifying x separately. To bool a twice, you'd need to write this instead:
In its optimized form, this still only boolifies a once. But we can defeat the optimization:
The "or False" part implies a booleanness check on its left operand, and the 'if' statement performs a boolean truthiness check on its result. That means two calls to __bool__ in the unoptimized form. But it gets optimized, on the assumption that __bool__ is a pure function. The version assigning to a temporary variable does one check before assigning, and then another check in the 'if'; the same thing without the temporary skips the second check, and just goes ahead and enters the body of the 'if'. Technically that's a semantic change. But I doubt it'll hurt anyone. ChrisA

On 3/28/2018 11:27 AM, Serhiy Storchaka wrote:
Perhaps Ref 3.3.1 object.__bool__ entry, after " should return False or True.", should say something like "Should not have side-effects, as redundant bool calls may be optimized away (bool(bool(ob)) should have the same result as bool(ob))." -- Terry Jan Reedy

On Fri, Mar 30, 2018 at 3:29 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
And bool(obj) does always return True or False; if you define a __bool__ method that returns something else then bool rejects it and raises TypeError. So bool(bool(obj)) is already indistinguishable from bool(obj). However, the naive implementation of 'if a and True:' doesn't call bool(bool(a)), it calls bool(a) twice, and this *is* distinguishable by user code, at least in principle. If we want to change the language spec, I guess it would be with text like: "if bool(obj) would be called twice in immediate succession, with no other code in between, then the interpreter may assume that both calls would return the same value and elide one of them". -n -- Nathaniel J. Smith -- https://vorpus.org

On 30 March 2018 at 21:16, Nathaniel Smith <njs@pobox.com> wrote:
For example: >>> class FlipFlop: ... _state = False ... def __bool__(self): ... result = self._state ... self._state = not result ... return result ... >>> toggle = FlipFlop() >>> bool(toggle) False >>> bool(toggle) True >>> bool(toggle) False >>> bool(toggle) and bool(toggle) False >>> toggle and toggle <__main__.FlipFlop object at 0x7f35293604e0> >>> bool(toggle and toggle) True So the general principle is that __bool__ implementations shouldn't do anything that will change the result of the next call to __bool__, or else weirdness is going to result. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Mar 30, 2018 at 4:41 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I don't think this way of stating it is general enough. For example, you could have a nondeterministic implementation of __bool__ that doesn't itself carry any state (e.g. flipping the result with some probability), but the next call could nevertheless still return a different result. So I think Nathaniel's way of stating it is probably better:
--Chris

On 3/30/2018 6:29 AM, Serhiy Storchaka wrote:
That is what the parenthetical comment says, but it is not right in the context and should be deleted. For the "if a and True:" example, 'redundant bool calls may be optimized away.' might be better written as 'duplicate implied __bool__ calls may be avoided.' What I am trying to say is that *we* define the intended behavior of special methods, and we should define what an implementation may actually expect. The current optimizer expects __bool__ to have no side effects, at least none that it need respect. Having said what __bool__ should do, we can also say what it should not do to avoid possible surprises -- at least in production code, as opposed to 'testing' code like the examples in this thread. -- Terry Jan Reedy

On Wed, Mar 28, 2018 at 06:27:19PM +0300, Serhiy Storchaka wrote:
If the aim here is to be an optimization, then I vote strongly for 2b. That gives you *faster f-strings* that have the same order-of-evaluation of normal method calls, so that when you optimize str.format into an f-string, not only is the behaviour identical, but they will be even faster than with option 3. Python's execution model implies that obj.method(expression_a, expression_b) should fully evaluate both expressions before they are passed to the method. Making str.format a magical special case that violates that rule should be a last resort. In this case, we can have our cake and eat it too: both the str.format to f-string optimization and keeping the normal evaluation rules. And as a bonus, we make f-strings even faster. I say "we", but of course it is Serhiy doing the work, thank you. Is there a down-side to 2b? It sounds like something you might end up doing at a later date regardless of what you do now. -- Steve

30.03.18 02:16, Steven D'Aprano пише:
Is there a down-side to 2b? It sounds like something you might end up doing at a later date regardless of what you do now.
This complicate the compiler and the eval loop, especially in the case of nested substitutions in formats, like f'{value:+{width:d}.{prec:d}f}' The speed gain can be too small. The complex implementation of the opcode should be tightly integrated with the ceval loop, otherwise we can get a slow down, as in one of my experimental implementation of BUILD_STRING (https://bugs.python.org/issue27078#msg270505). The exact benefit is unknown until this feature be implemented.

On 30 March 2018 at 20:05, Serhiy Storchaka <storchaka@gmail.com> wrote:
This point reminded me that there's still https://www.python.org/dev/peps/pep-0536/ to consider as well (that's the PEP about migrating f-strings to a fully nested expression grammar rather than hijacking the existing string tokenisation code). I *think* that's an orthogonal concern (since it relates to the initial parsing and AST compilation phase, rather then the code generation and execution phase), but it's worth keeping in mind. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

[Steven D'Aprano <steve@pearwood.info>]
There are always downsides ;-) As Serhiy noted later, the idea that "it's faster" is an educated guess - you can't know before it's implemented. Changes to the very complicated eval loop often have not only surprising speed consequences on one platform, but even consequences in opposite directions across platforms. Not necessarily in the part you directly changed, either. Optimizing C compilers just can't reliably guess what's most important in such a massive pile of test-and-branch laden code. Indeed, which paths through the eval loop _are_ most important depend on the Python program you're running at the time (which is, e.g., why "profile-guided optimization" was invented). So there's an ocean of potential complications there, and wading through those has opportunity cost too: Serhiy is a very productive contributor, but time he spends on this is time he won't be spending on other things of potentially greater value. That's all up to him, though. I'm not keen on changing the behavior of f-strings regardless (2a or 2b). While their implementation details aren't documented, they were intentional, and follow the pattern increasingly large parts of the language and std library adopted after the iterator protocol was introduced: compute intermediate results as they're needed, not all in advance. That's proved to have many advantages. It's certainly possible to write custom purely functional (no side effects) __format__ methods such that memory use in an f-string remains bounded under the current implementation, but can grow without bound if all __format__ arguments need to be evaluated before any formatting begins. It's akin to the difference between iterating over range() and xrange() in Python 2. I don't know that there's any real f-string code out there _relying_ on that - but don't know that there isn't either. It's more plausible to me than that there are non-functional real __format__ methods. I'd be happiest if no behaviors changed in anything. Then the only downsides to optimizing are code bloat, code churn, new bugs, subtler corner cases, less predictable behavior for end users, and increased implementation complexity forever after ;-)

Serhiy Storchaka schrieb am 28.03.2018 um 17:27:
Just for the record, I implemented the translation-time transformation described in [1] (i.e. '%' string formatting with a tuple) in Cython 0.28 (released mid of March, see [3]), and it has the same problem of changing the behaviour. I was aware of this at the time I implemented it, but decided to postpone the fix of evaluating the arguments before formatting them, as I considered it low priority compared to the faster execution. I expect failures during formatting to be rare in real-world code, where string building tends to be at the end of a processing step that should catch many value problems already. Rare, but not impossible. I do consider this change in behaviour a bug that should be fixed, and I would also consider it a bug in CPython if it was added there. Stefan [3] https://github.com/cython/cython/blob/de618c0141ae818e7a4c35d46256d98e6b6dba...

Hm, without thinking too much about it I'd say it's okay to change the evaluation order. Can these optimizations be disabled with something like -O0? On Wed, Mar 28, 2018 at 8:27 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

28.03.18 19:20, Guido van Rossum пише:
Hm, without thinking too much about it I'd say it's okay to change the evaluation order.
Do you mean the option 3, right? This is the simplest option. I have already wrote a PR for optimizing old-style formating [1], but have not merged it yet due to this change of semantic.
Can these optimizations be disabled with something like -O0?
Currently there is no way to disable optimizations. There is an open issue with a request for this. [2] [1] https://github.com/python/cpython/pull/5012 [2] https://bugs.python.org/issue2506

28.03.18 22:05, Paul Moore пише
If formatting a and evaluating b both raise exceptions, the resulting exception depends on the order. $ ./python -bb >>> a = b'bytes' >>> '{}{}'.format(a, b) Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'b' is not defined >>> f'{a}{b}' Traceback (most recent call last): File "<stdin>", line 1, in <module> BytesWarning: str() on a bytes instance

[Serhiy Storchaka <storchaka@gmail.com>]
I have a hard time imaging how that could have come to be, but if it's true I'd say the unoptimized code was plain wrong. The dumbest possible way to implement `f() and g()` is also the correct ;-) way: result = f() if not bool(result): result = g() For the thing you really care about here, the language guarantees `a` will be evaluated before `b` in: '{}{}'.format(a, b) but I'm not sure it says anything about how the format operations are interleaved. So your proposed transformation is fine by me (your #3: still evaluate `a` before `b` but ignore that the format operations may occur in a different order with respect to those).

[Tim]
Ah, I think I see your point now. In the _context_ of `if f() and g()`, the dumbest possible code generation would do the above, and then go on to do if bool(result): .... If in fact `f()` returned a false-like value, an optimizer could note that `bool(result)` had already been evaluated and skip the redundant evaluation. I think that's fine either way: what the language guarantees is that `f()` will be evaluated exactly once, and `g()` no more than once, and that's all so regardless.

28.03.18 21:30, Tim Peters пише:
Optimized if f() and g(): spam() is equivalent to result = f() if bool(result): result = g() if bool(result): spam() Without optimization it would be equivalent to result = f() if bool(result): result = g() if bool(result): spam() It calls bool() for the result of f() twice if it is false. Thus there is a small difference between if f() and g(): spam() and tmp = f() and g() if tmp: spam()

Serhiy Storchaka schrieb am 28.03.2018 um 17:27:
I agree that it's not normally a problem, but if the formatting of 'a' fails and raises an exception, then 'b' will not get evaluated at all in the second case. Whether this difference is subtle or not is seems to depend largely on the code at hand. Stefan

On 29 March 2018 at 08:09, Tim Delaney <timothy.c.delaney@gmail.com> wrote:
Sorry about that - finger slipped and I sent an incomplete email ... If I'm not mistaken, #3 would result in the optimiser changing str.format() into an f-string in-place. Is this correct? We're not talking here about people manually changing the code from str.format() to f-strings, right? I would argue that any optimisation needs to have the same semantics as the original code - in this case, that all arguments are evaluated before the string is formatted. I also assumed (not having actually used an f-string) that all its formatting arguments were evaluated before formatting. So my preference would be (if my understanding in the first line is correct): 1: +0 2a: +0.5 2b: +1 3: -1 Tim Delaney

[Tim Delaney <timothy.c.delaney@gmail.com>]
All correct. It's a magical transformation from one spelling to another.
That's why Serhiy is asking about it - there _are_ potentially visible changes in behavior under all but one of his suggestions.
I also assumed (not having actually used an f-string) that all its formatting arguments were evaluated before formatting.
It's a string - it doesn't have "arguments" as such. For example: def f(a, b, n): return f"{a+b:0{n}b}" # the leading "f" makes it an f-string Then
f(2, 3, 12) '000000000101'
The generated code currently interleaves evaluating expressions with formatting the results in a more-than-less obvious way, waiting until the end to paste all the formatted fragments together. As shown in the example, this can be more than one level deep (the example needs to paste together "0", str(n), and "b" to _build_ the format code for `a+b`).
That's the only suggestion with no potentially visible changes. I'll add another: leave `.format()` alone entirely - there's no _need_ to "optimize" it, it's just a maybe-nice-to-have.
2a: +0.5 2b: +1
Those two don't change the behaviors of `.format()`, but _do_ change some end-case behaviors of f-strings. If you're overly ;-) concerned about the former, it would be consistent to be overly concerned about the latter too.
3: -1
And that's the one that proposes to let .format() also interleave expression evaluation (but still strictly "left to right") with formatting. If it were a general code transformation, I'm sure everyone would be -1. As is, it's hard to care. String formatting is a tiny area, and format methods are generally purely functional (no side effects). If anyone has a non-contrived example where the change would make a lick of real difference, they shouldn't be shy about posting it :-) I looked, and can't find any in my code.

My credentials for this are that I re-worked str.format in Jython quite extensively, and I followed the design of f-strings a bit when they were introduced, but I haven't used them to write anything. On 29/03/2018 00:48, Tim Peters wrote:
Agreed "argument" is the wrong word, but so is "string". It's an expression returning a string, in which a, b and n are free variables. I think we can understand it best as a string-display (https://docs.python.org/3/reference/expressions.html#list-displays), or a sort of eval() call. The difference Serhiy identifies emerges (I think) because in the conventional interpretation of a format call, the arguments of format are evaluated left-to right (all of them) and then formatted in the order references are encountered to these values in a tuple or dictionary. In an f-string expressions are evaluated as they are encountered. A more testing example is therefore perhaps: '{1} {0}'.format(a(), b()) # E1 f'{b()}{a()}' # E2 I think I would be very surprised to find b called before a in E1 because of the general contract on the meaning of method calls. I'm assuming that's what an AST-based optimisation would do? There's no reason in E2 to call them in any other order than b then a and the documentation tells me they are. But do I expect a() to be called before the results of b() are formatted? In E1 I definitely expect that. In E2 I don't think I'd be surprised either way. Forced to guess, I would guess that b() would be formatted and in the output buffer before a() was called, since it gives the implementation fewer things to remember. Then I hope I would not depend on this guesswork. Strictly-speaking the documentation doesn't say when the result is formatted in relation to the evaluation of other expressions, so there is permission for Serhiy's idea #2. I think the (internal) AST change implied in Serhiy's idea #1 is the price one has to pay *if* one insists on optimising str.format(). str.format just a method like any other. The reasons would have to be very strong to give it special-case semantics. I agree that the cases are rare in which one would notice a difference. (Mostly I think it would be a surprise during debugging.) But I think users should be able to rely on the semantics of call. Easier optimisation doesn't seem to me a strong enough argument. This leaves me at: 1: +1 2a, 2b: +0 3: -1 Jeff Allen

On 3/29/2018 6:17 AM, Jeff Allen wrote:
Thanks for your work on Jython. And hop on the f-string bandwagon!
I don't think we should restrict f-strings to having to evaluate all of the expressions before formatting. But, if we do restrict it, we should document whatever the order is in 3.6 and add tests to ensure the behavior doesn't change.
#1 seems so complex as to not be worth it, given the likely small overall impact of the optimization to a large program. If the speedup really is sufficiently important for a particular piece of code, I'd suggest just rewriting the code to use f-strings, and the author could then determine if the transformation breaks anything. Maybe write a 2to3 like tool that would identify places where str.format or %-formatting could be replaced by f-strings? I know I'd run it on my code, if it existed. Because the optimization can only work code with literals, I think manually modifying the source code is an acceptable solution if the possible change in semantics implied by #3 are unacceptable. Eric.

On 29 March 2018 at 21:50, Eric V. Smith <eric@trueblade.com> wrote:
While more projects are starting to actively drop Python 2.x support, there are also quite a few still straddling the two different versions. The "rewrite to f-strings" approach requires explicitly dropping support for everything below 3.6, whereas implicit optimization of literal based formatting will work even for folks preserving backwards compatibility with older versions. As far as the semantics go, perhaps it would be possible to explicitly create a tuple as part of the implementation to ensure that the arguments are still evaluated in order, and everything gets calculated exactly once? This would have the benefit that even format strings that used numbered references could be optimised in a fairly straightforward way. '{}{}'.format(a, b) would become: _hidden_ref = (a, b) f'{_hidden_ref[0]}{_hidden_ref[1]}' while: '{1}{0}'.format(a, b) would become: _hidden_ref = (a, b) f'{_hidden_ref[1]}{_hidden_ref[0]}' This would probably need to be implemented as Serhiy's option 1 (generating a distinct AST node), which in turn leads to 2a: adding extra stack manipulation opcodes in order to more closely replicate str.format semantics. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 3/29/2018 12:13 PM, Nick Coghlan wrote:
Sure. But 3.6 will be 3 years old before this optimization is released. I've been seeing 3.4 support dropping off, and expect to see 3.5 follow suit by the time 3.8 is released. Although maybe the thought is to do this in a bug-fix release? If we're changing semantics at all, that seems like a non-starter.
I still think the complexity isn't worth it, but maybe I'm a lone voice on this. Eric.

On 30 March 2018 at 03:33, Eric V. Smith <eric@trueblade.com> wrote:
Definitely 3.8+ only. The nice thing about doing this implicitly at the compiler level is that it potentially provides an automatic performance improvement for existing libraries and applications. The justification for the extra complexity would then come from whether or not it actually measurably improve things, either for the benchmark suite, or for folks' real-world applications. Steven D'Aprano also raises a good point on that front: a FORMAT_STRING super-opcode that sped up f-strings *and* allowed semantics preserving constant-folding of str.format calls on string literals could make more sense than a change that focused solely on the implicit optimisation case. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

29.03.18 13:17, Jeff Allen пише:
I was going to optimize only formatting with implicit references. '{} {}' but not '{1} {0}' and either not '{0} {1}'. This guaranties in-order computation and referencing every subexpression only once. I don't have a goal of converting every string formatting, but only the most common and the most simple ones. If go further, we will need to add several new AST nodes (like for comprehensions).

On Wed, Mar 28, 2018 at 06:27:19PM +0300, Serhiy Storchaka wrote:
I don't understand this. Why would bool(a) be called twice, and when did this change? Surely calling it twice would be a bug. I just tried the oldest Python 3 I have on this computer, 3.2, and bool is only called once. -- Steve

On Fri, Mar 30, 2018 at 1:08 AM, Chris Angelico <rosuav@gmail.com> wrote:
Wow, I'm good. Premature send much? Nice going, Chris. Let's try that again. Here's some similar code:
This is, however, boolifying a, then boolifying x separately. To bool a twice, you'd need to write this instead:
In its optimized form, this still only boolifies a once. But we can defeat the optimization:
The "or False" part implies a booleanness check on its left operand, and the 'if' statement performs a boolean truthiness check on its result. That means two calls to __bool__ in the unoptimized form. But it gets optimized, on the assumption that __bool__ is a pure function. The version assigning to a temporary variable does one check before assigning, and then another check in the 'if'; the same thing without the temporary skips the second check, and just goes ahead and enters the body of the 'if'. Technically that's a semantic change. But I doubt it'll hurt anyone. ChrisA

On 3/28/2018 11:27 AM, Serhiy Storchaka wrote:
Perhaps Ref 3.3.1 object.__bool__ entry, after " should return False or True.", should say something like "Should not have side-effects, as redundant bool calls may be optimized away (bool(bool(ob)) should have the same result as bool(ob))." -- Terry Jan Reedy

On Fri, Mar 30, 2018 at 3:29 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
And bool(obj) does always return True or False; if you define a __bool__ method that returns something else then bool rejects it and raises TypeError. So bool(bool(obj)) is already indistinguishable from bool(obj). However, the naive implementation of 'if a and True:' doesn't call bool(bool(a)), it calls bool(a) twice, and this *is* distinguishable by user code, at least in principle. If we want to change the language spec, I guess it would be with text like: "if bool(obj) would be called twice in immediate succession, with no other code in between, then the interpreter may assume that both calls would return the same value and elide one of them". -n -- Nathaniel J. Smith -- https://vorpus.org

On 30 March 2018 at 21:16, Nathaniel Smith <njs@pobox.com> wrote:
For example: >>> class FlipFlop: ... _state = False ... def __bool__(self): ... result = self._state ... self._state = not result ... return result ... >>> toggle = FlipFlop() >>> bool(toggle) False >>> bool(toggle) True >>> bool(toggle) False >>> bool(toggle) and bool(toggle) False >>> toggle and toggle <__main__.FlipFlop object at 0x7f35293604e0> >>> bool(toggle and toggle) True So the general principle is that __bool__ implementations shouldn't do anything that will change the result of the next call to __bool__, or else weirdness is going to result. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Mar 30, 2018 at 4:41 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I don't think this way of stating it is general enough. For example, you could have a nondeterministic implementation of __bool__ that doesn't itself carry any state (e.g. flipping the result with some probability), but the next call could nevertheless still return a different result. So I think Nathaniel's way of stating it is probably better:
--Chris

On 3/30/2018 6:29 AM, Serhiy Storchaka wrote:
That is what the parenthetical comment says, but it is not right in the context and should be deleted. For the "if a and True:" example, 'redundant bool calls may be optimized away.' might be better written as 'duplicate implied __bool__ calls may be avoided.' What I am trying to say is that *we* define the intended behavior of special methods, and we should define what an implementation may actually expect. The current optimizer expects __bool__ to have no side effects, at least none that it need respect. Having said what __bool__ should do, we can also say what it should not do to avoid possible surprises -- at least in production code, as opposed to 'testing' code like the examples in this thread. -- Terry Jan Reedy

On Wed, Mar 28, 2018 at 06:27:19PM +0300, Serhiy Storchaka wrote:
If the aim here is to be an optimization, then I vote strongly for 2b. That gives you *faster f-strings* that have the same order-of-evaluation of normal method calls, so that when you optimize str.format into an f-string, not only is the behaviour identical, but they will be even faster than with option 3. Python's execution model implies that obj.method(expression_a, expression_b) should fully evaluate both expressions before they are passed to the method. Making str.format a magical special case that violates that rule should be a last resort. In this case, we can have our cake and eat it too: both the str.format to f-string optimization and keeping the normal evaluation rules. And as a bonus, we make f-strings even faster. I say "we", but of course it is Serhiy doing the work, thank you. Is there a down-side to 2b? It sounds like something you might end up doing at a later date regardless of what you do now. -- Steve

30.03.18 02:16, Steven D'Aprano пише:
Is there a down-side to 2b? It sounds like something you might end up doing at a later date regardless of what you do now.
This complicate the compiler and the eval loop, especially in the case of nested substitutions in formats, like f'{value:+{width:d}.{prec:d}f}' The speed gain can be too small. The complex implementation of the opcode should be tightly integrated with the ceval loop, otherwise we can get a slow down, as in one of my experimental implementation of BUILD_STRING (https://bugs.python.org/issue27078#msg270505). The exact benefit is unknown until this feature be implemented.

On 30 March 2018 at 20:05, Serhiy Storchaka <storchaka@gmail.com> wrote:
This point reminded me that there's still https://www.python.org/dev/peps/pep-0536/ to consider as well (that's the PEP about migrating f-strings to a fully nested expression grammar rather than hijacking the existing string tokenisation code). I *think* that's an orthogonal concern (since it relates to the initial parsing and AST compilation phase, rather then the code generation and execution phase), but it's worth keeping in mind. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

[Steven D'Aprano <steve@pearwood.info>]
There are always downsides ;-) As Serhiy noted later, the idea that "it's faster" is an educated guess - you can't know before it's implemented. Changes to the very complicated eval loop often have not only surprising speed consequences on one platform, but even consequences in opposite directions across platforms. Not necessarily in the part you directly changed, either. Optimizing C compilers just can't reliably guess what's most important in such a massive pile of test-and-branch laden code. Indeed, which paths through the eval loop _are_ most important depend on the Python program you're running at the time (which is, e.g., why "profile-guided optimization" was invented). So there's an ocean of potential complications there, and wading through those has opportunity cost too: Serhiy is a very productive contributor, but time he spends on this is time he won't be spending on other things of potentially greater value. That's all up to him, though. I'm not keen on changing the behavior of f-strings regardless (2a or 2b). While their implementation details aren't documented, they were intentional, and follow the pattern increasingly large parts of the language and std library adopted after the iterator protocol was introduced: compute intermediate results as they're needed, not all in advance. That's proved to have many advantages. It's certainly possible to write custom purely functional (no side effects) __format__ methods such that memory use in an f-string remains bounded under the current implementation, but can grow without bound if all __format__ arguments need to be evaluated before any formatting begins. It's akin to the difference between iterating over range() and xrange() in Python 2. I don't know that there's any real f-string code out there _relying_ on that - but don't know that there isn't either. It's more plausible to me than that there are non-functional real __format__ methods. I'd be happiest if no behaviors changed in anything. Then the only downsides to optimizing are code bloat, code churn, new bugs, subtler corner cases, less predictable behavior for end users, and increased implementation complexity forever after ;-)

Serhiy Storchaka schrieb am 28.03.2018 um 17:27:
Just for the record, I implemented the translation-time transformation described in [1] (i.e. '%' string formatting with a tuple) in Cython 0.28 (released mid of March, see [3]), and it has the same problem of changing the behaviour. I was aware of this at the time I implemented it, but decided to postpone the fix of evaluating the arguments before formatting them, as I considered it low priority compared to the faster execution. I expect failures during formatting to be rare in real-world code, where string building tends to be at the end of a processing step that should catch many value problems already. Rare, but not impossible. I do consider this change in behaviour a bug that should be fixed, and I would also consider it a bug in CPython if it was added there. Stefan [3] https://github.com/cython/cython/blob/de618c0141ae818e7a4c35d46256d98e6b6dba...
participants (14)
-
Chris Angelico
-
Chris Jerdonek
-
Eric V. Smith
-
Guido van Rossum
-
Jeff Allen
-
Nathaniel Smith
-
Nick Coghlan
-
Paul Moore
-
Serhiy Storchaka
-
Stefan Behnel
-
Steven D'Aprano
-
Terry Reedy
-
Tim Delaney
-
Tim Peters