
On 29 March 2018 at 21:50, Eric V. Smith <eric@trueblade.com> wrote:
> #1 seems so complex as to not be worth it, given the likely small overall impact of the optimization to a large program. If the speedup really is sufficiently important for a particular piece of code, I'd suggest just rewriting the code to use f-strings, and the author could then determine if the transformation breaks anything. Maybe write a 2to3-like tool that would identify places where str.format or %-formatting could be replaced by f-strings? I know I'd run it on my code, if it existed. Because the optimization can only work on code with literals, I think manually modifying the source code is an acceptable solution if the possible change in semantics implied by #3 is unacceptable.
While more projects are starting to actively drop Python 2.x support, there are also quite a few still straddling the two different versions. The "rewrite to f-strings" approach requires explicitly dropping support for everything below 3.6, whereas implicit optimization of literal-based formatting will work even for folks preserving backwards compatibility with older versions.

As far as the semantics go, perhaps it would be possible to explicitly create a tuple as part of the implementation to ensure that the arguments are still evaluated in order, and everything gets calculated exactly once? This would have the benefit that even format strings that used numbered references could be optimised in a fairly straightforward way.

    '{}{}'.format(a, b)

would become:

    _hidden_ref = (a, b)
    f'{_hidden_ref[0]}{_hidden_ref[1]}'

while:

    '{1}{0}'.format(a, b)

would become:

    _hidden_ref = (a, b)
    f'{_hidden_ref[1]}{_hidden_ref[0]}'

This would probably need to be implemented as Serhiy's option 1 (generating a distinct AST node), which in turn leads to 2a: adding extra stack manipulation opcodes in order to more closely replicate str.format semantics.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
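
The evaluation-order point behind the tuple idea can be checked by hand. The following is only a sketch in plain Python (3.6+), not what a compiler would actually emit: the helper traced() and the three wrapper functions are hypothetical names introduced for illustration, and the "rewritten" form is written out manually rather than produced by any optimizer.

```python
# Illustration only: hand-written stand-ins for the transformation
# discussed above. traced(), original(), rewritten() and naive() are
# hypothetical helpers, not part of any proposal or of CPython.
calls = []


def traced(name, value):
    """Return value, recording that this argument expression was evaluated."""
    calls.append(name)
    return value


def original():
    """The str.format call as the author wrote it."""
    calls.clear()
    result = '{1}{0}'.format(traced('a', 'A'), traced('b', 'B'))
    return result, list(calls)


def rewritten():
    """Tuple-based form: arguments are evaluated once, left to right."""
    calls.clear()
    _hidden_ref = (traced('a', 'A'), traced('b', 'B'))
    result = f'{_hidden_ref[1]}{_hidden_ref[0]}'
    return result, list(calls)


def naive():
    """Direct substitution into an f-string: evaluation order follows
    the order of the replacement fields, not of the arguments."""
    calls.clear()
    result = f"{traced('b', 'B')}{traced('a', 'A')}"
    return result, list(calls)


if __name__ == '__main__':
    print(original())
    print(rewritten())
    print(naive())
```

All three produce the same string, but only the tuple-based form evaluates the arguments in the same order as the original call. A field that is referenced more than once (for example '{0}{0}') would similarly be evaluated only once through the tuple, whereas a direct substitution would evaluate it twice.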