Briefer string format
Have long wished python could format strings easily like bash or perl do, ... and then it hit me:

    csstext += f'{nl}{selector}{space}{{{nl}'

(This script included whitespace vars to provide a minification option.)

I've seen others make similar suggestions, but to my knowledge they didn't include this pleasing brevity aspect.

-Mike
Hi,

Ok, I kept the message brief because I thought this subject had previously been discussed often. I've expanded it to explain better for those that are interested.

---

Needed to whip up some css strings, took a look at the formatting I had done and thought it was pretty ugly. I started with the printf style, and had pulled out the whitespace as vars in order to have a minification option:

    csstext += '%s%s%s{%s' % (nl, key, space, nl)

Decent but not great, a bit hard on the eyes. So I decided to try .format():

    csstext += '{nl}{key}{space}{{{nl}'.format(**locals())

This looks a bit better if you ignore the right half, but it is longer and not as simple as one might hope. It is much longer still if you type out the variables needed as keyword params! The '{}' option is not much improvement either.

    csstext += '{nl}{key}{space}{{{nl}'.format(nl=nl, key=key, ... # uggh
    csstext += '{}{}{}{{{}'.format(nl, key, space, nl)

I've long wished python could format strings easily like bash or perl do, ... and then it hit me:

    csstext += f'{nl}{key}{space}{{{nl}'

An "f-formatted" string could automatically format with the locals dict. Not yet sure about globals, and unicode only suggested for now. Perhaps could be done directly to avoid the .format() function call, which adds some overhead and tends to double the length of the line?

I remember a GvR talk a few years ago giving a 'meh' on .format() and have agreed, using it only when I have a very large or complicated string-building need, at the point where it begins to overlap Jinja territory. Perhaps this is one way to make it more comfortable for everyday usage.

I've seen others make similar suggestions, but to my knowledge they didn't include this pleasing brevity aspect.

-Mike

On 07/19/2015 04:27 PM, Eric V. Smith wrote:
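As a check that the spellings Mike compares really do build the same text, here is a minimal sketch. It assumes Python 3.6 or later, where the f'...' form proposed in this thread did eventually arrive as PEP 498 f-strings; the values given to nl, space and key are made up for illustration.

```python
# Illustrative values (assumed); empty strings for nl and space would
# give the minified output Mike mentions.
nl, space, key = "\n", " ", "body"

percent_style = '%s%s%s{%s' % (nl, key, space, nl)
named_format = '{nl}{key}{space}{{{nl}'.format(nl=nl, key=key, space=space)
positional_format = '{}{}{}{{{}'.format(nl, key, space, nl)
f_string = f'{nl}{key}{space}{{{nl}'   # requires Python 3.6+

# All four spellings produce the same string.
assert percent_style == named_format == positional_format == f_string
print(repr(f_string))
```

The only doubled brace is the literal `{` that opens the CSS block; every other pair of braces is a replacement field.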
On Mon, Jul 20, 2015 at 9:35 AM, Mike Miller <python-ideas@mgmiller.net> wrote:
Point to note: Currently, all the string prefixes are compile-time directives only. A b"bytes" or u"unicode" prefix affects what kind of object is produced, and all the others are just syntactic differences. In all cases, a string literal is a single immutable object which can be stashed away as a constant. What you're suggesting here is a thing that looks like a literal, but is actually a run-time operation. As such, I'm pretty dubious; coupled with the magic of dragging values out of the enclosing namespace, it's going to be problematic as regards code refactoring. Also, you're going to have heaps of people arguing that this should be a shorthand for str.format(**locals()), and about as many arguing that it should follow the normal name lookups (locals, nonlocals, globals, builtins).

I'm -1 on the specific idea, though definitely sympathetic to the broader concept of simplified formatting of strings. Python's printf-style formatting has its own warts (mainly because of the cute use of an operator, rather than doing it as a function call), and still has the problem of having percent markers with no indication of what they'll be interpolating in. Anything that's explicit is excessively verbose, anything that isn't is cryptic. There's no easy fix.

ChrisA
"Point to note: Currently, all the string prefixes are compile-time directives only. A b"bytes" or u"unicode" prefix affects what kind of object is produced, and all the others are just syntactic differences. In all cases, a string literal is a single immutable object which can be stashed away as a constant. What you're suggesting here is a thing that looks like a literal, but is actually a run-time operation."

Why wouldn't this be a compile time transform from

    f"string with braces"

into

    "string with braces".format(x=x, y=y, ...)

where x, y, etc are the names in each pair of braces (with an error if it can't get a valid identifier out of each format code)? It's syntactic sugar for a simple function call with perfectly well defined semantics - you don't even have to modify the string literal.

Defined as a compile time transform like this, I'm +1. As soon as any suggestion mentions "locals()" or "globals()" I'm -1.

Cheers,
Steve
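The transform Steve describes can be approximated at run time to see what it would generate. The sketch below is not a compiler change: rewrite_fstring is a hypothetical helper, not something from the thread, that uses string.Formatter().parse to pull the field names out of the braces and returns the source text of the equivalent .format() call, raising an error when a field is not a plain identifier.

```python
import keyword
import string

def rewrite_fstring(template):
    """Return source text for the .format() call the transform implies.

    Hypothetical helper, for illustration only.
    """
    names = []
    for _literal, field, _spec, _conv in string.Formatter().parse(template):
        if field is None:          # literal text with no replacement field
            continue
        if not field.isidentifier() or keyword.iskeyword(field):
            raise ValueError(f"not a valid identifier: {field!r}")
        if field not in names:     # keep each name once, in first-seen order
            names.append(field)
    kwargs = ", ".join(f"{name}={name}" for name in names)
    return f"{template!r}.format({kwargs})"

print(rewrite_fstring("{nl}{key}{space}{{{nl}"))
```

Because the generated call names each variable explicitly, the names are resolved by the ordinary scoping rules wherever the call appears, with no reference to locals() or globals().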
On Mon, Jul 20, 2015 at 10:43 AM, Steve Dower <Steve.Dower@microsoft.com> wrote:
It'd obviously have to be a compile-time transformation. My point is that it would, unlike all other forms of literal, translate into a function call.

How is your "x=x, y=y" version materially different from explicitly mentioning locals() or globals()? The only significant difference is that your version follows the scope order outward, where locals() and globals() call up a specific scope each.

Will an f"..." format string be mergeable with other strings? All the other types of literal can be (apart, of course, from mixing bytes and unicode), but this would have to be something somehow different.

In every way that I can think of, this is not a literal - it is a construct that results in a run-time operation. A context-dependent operation, at that. That's why I'm -1 on this looking like a literal.

ChrisA
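The merging Chris refers to happens when the source is compiled, which can be observed from the compiled code object. A small sketch, assuming CPython; the variable names are only for illustration:

```python
# Adjacent string literals are joined by the compiler into one constant.
merged = compile('"Hello, " "world"', "<example>", "eval")
print(merged.co_consts)

# Mixing bytes and str literals is rejected at compile time.
mix_error = None
try:
    compile('"hello" b"world"', "<example>", "eval")
except SyntaxError as exc:
    mix_error = exc
    print("SyntaxError:", exc.msg)
```

Nothing is executed in either case: both the merge and the error come from compile() alone.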
Chris Angelico wrote:
Excluding dictionary literals, of course. And class definitions. Decorators too, and arguably the descriptor protocol and __getattribute__ make things that look like attribute lookups into function calls. Python is littered with these, so I'm not sure that your point has any historical support.
Yes, it follows normal scoping rules and doesn't invent/define/describe new ones for this particular case. There is literally no difference between the function call version and the prefix version wrt scoping. As an example of why "normal rules" are better than "locals()/globals()", how would you implement this using just locals() and globals()?
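The example that followed this question did not survive in the archive. As a stand-in (not Steve's original code), the sketch below shows the gap he is pointing at, using a nested function: locals() called in the inner scope does not contain the enclosing function's x, whereas an ordinary name reference finds it through the normal closure rules.

```python
def with_locals():
    x = "world"
    def inner():
        # locals() here describes inner's own scope, which has no 'x'.
        return "x" in locals()
    return inner()

def with_name_lookup():
    x = "world"
    def inner():
        # A plain name reference is compiled as a closure variable.
        return "Hello, {x}!".format(x=x)
    return inner()

print(with_locals())
print(with_name_lookup())
```

Steve's correction later in the thread indicates that his example used a list comprehension. What locals() reports inside a comprehension may differ between Python versions, which is why a nested function is used here.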
Given that this is the current behaviour:
I don't mind saying "no" here, especially since the merging is done while compiling, but it would be possible to generate a runtime concatenation here. Again, you only "know" that code (currently) has no runtime effect because, well, because you know it. It's a change, but it isn't world ending.
In every way that I can think of, this is not a literal - it is a construct that results in a run-time operation.
Most new Python developers (with backgrounds in other languages) are surprised that "class" is a construct that results in a run-time operation, and would be surprised that writing a dictionary literal also results in a run-time operation if they ever had reason to notice. I believe the same would apply here.
A context-dependent operation, at that.
You'll need to explain this one for me - how is it "context-dependent" when you are required to provide a string prefix?
That's why I'm -1 on this looking like a literal.
I hope you'll reconsider, because I think you're working off some incorrect or over-simplified beliefs. (Though this reply isn't just intended for Chris, but for everyone following the discussion, so I hope *everyone* considers both sides.) Cheers, Steve
"return [locals()[x] for _ in range(1)]"

I lost some quotes here around the x, but it doesn't affect the behavior - you still can't get outside the comprehension scope here.

Cheers,
Steve
On Mon, Jul 20, 2015 at 11:33 AM, Steve Dower <Steve.Dower@microsoft.com> wrote:
Dictionary/list display isn't a literal, and every time it's evaluated, you get a brand new object, not another reference to the same literal. Compare:
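The comparison that followed is missing from the archive. A sketch of the kind of contrast being described, assuming CPython; the function names are made up here:

```python
def make_list():
    return [1, 2, 3]      # list display: built afresh on every call

def make_string():
    return "hello"        # string literal: one constant stored with the code

print(make_list() is make_list())      # False: two distinct list objects
print(make_string() is make_string())  # True: the same constant both times
```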
Class and function definitions are also not literals, although people coming from other languages are often confused by this. (I've seen people write functions down the bottom of the file that are needed by top-level code higher up. It's just something you have to learn - Python doesn't "declare" functions, it "defines" them.) Going the other direction, there are a few things that you might think are literals but aren't technically so, such as "2+3j", which is actually two literals (int and imaginary) and a binary operation; but constant folding makes them functionally identical to constants. The nearest equivalent to this proposal is tuple display, which can sometimes function almost like a literal:
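The tuple example referred to here is also missing from the archive. A stand-in sketch, assuming CPython's constant folding, with a made-up function name:

```python
import dis

def make_tuple():
    return (1, 2, 3)

# In CPython the folded tuple is stored among the function's constants,
# so every call hands back the same object.
print((1, 2, 3) in make_tuple.__code__.co_consts)
print(make_tuple() is make_tuple())

# The listing shows the constant being used; no BUILD_TUPLE appears.
dis.dis(make_tuple)
```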
This disassembles to a simple fetching of a constant. However, it's really just like list display plus constant folding - the compiler notices that it'll always produce the same tuple, so it optimizes it down to a constant. In none of these cases is a string ever anything other than a simple constant. That's why this proposal is a distinct change; all of the cases where Python has non-constants that might be thought of as constants, they contain expressions (or even statements - class/function definitions), and are syntactically NOT single entities. Now, that's not to say that it cannot possibly be done. But I personally am not in favour of it.
Sure, that's where following the scoping rules is better than explicitly calling up locals(). On the flip side, going for locals() alone means you can easily and simply protect your code against grabbing the "wrong" things, by simply putting it inside a nested function (which is what your list comp there is doing), and it's also easier to explain what this construct does in terms of locals() than otherwise (what if there are attribute lookups or subscripts?).
Fair enough. I wouldn't mind saying "no" here too - in the same way that it's a SyntaxError to write u"hello" b"world", it would be a SyntaxError to mix either with f"format string".
That's part of learning the language (which things are literals and which aren't). Expanding the scope of potential confusion is a definite cost; I'm open to the argument that the benefit justifies that cost, but it is undoubtedly a cost.
A context-dependent operation, at that.
You'll need to explain this one for me - how is it "context-dependent" when you are required to provide a string prefix?
    def func1():
        x = "world"
        return f"Hello, {x}!"

    def func2():
        return f"Hello, {x}!"

They both return what looks like a simple string, but in one, it grabs a local x, and in the other, it goes looking for a global. This is one of the potential risks of such things as decimal.Decimal literals, because literals normally aren't context-dependent, but the Decimal constructor can be affected by precision controls and such. Again, not a killer, but another cost.
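With f-strings as they later shipped in Python 3.6 (an assumption relative to this 2015 thread), the difference can be watched at run time. In this sketch greet_global is a hypothetical variant of func2 that uses a name, who, which is deliberately left undefined until after the first call:

```python
def greet_local():
    who = "world"
    return f"Hello, {who}!"       # 'who' is a local of this function

def greet_global():
    return f"Hello, {who}!"       # 'who' is looked up as a global on each call

print(greet_local())

try:
    greet_global()                # no global 'who' exists yet
except NameError as exc:
    print("NameError:", exc)

who = "module"
print(greet_global())             # the same source text now yields a value
```

The string is rebuilt on every call from whatever the name resolves to at that moment, which is the sense in which the result depends on context.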
That's why I'm -1 on this looking like a literal.
I hope you'll reconsider, because I think you're working off some incorrect or over-simplified beliefs. (Though this reply isn't just intended for Chris, but for everyone following the discussion, so I hope *everyone* considers both sides.)
Having read your above responses, I'm now -0.5 on this proposal. There is definite merit to it, but I'm a bit dubious that it'll end up with one of the problems of PHP code: the uncertainty of whether something is a string or a piece of code. Some editors syntax-highlight all strings as straight-forward strings, same color all the way, while others will change color inside there to indicate interpolation sites. Which is correct? At least here, the prefix on the string makes it clear that this is a piece of code; but it'll take editors a good while to catch up and start doing the right thing - for whatever definition of "right thing" the authors choose.

Maybe my beliefs are over-simplified, in which case I'll be happy to be proven wrong by some immensely readable and clear real-world code examples. :)

ChrisA
On 20 July 2015 at 10:43, Steve Dower <Steve.Dower@microsoft.com> wrote:
I'm opposed to a special case compile time transformation for string formatting in particular, but in favour of clearly-distinct-from-anything-else syntax for such utilities: https://mail.python.org/pipermail/python-ideas/2015-June/033920.html

It would also need a "compile time import" feature for namespacing purposes, so you might be able to write something like:

    from !string import format # Compile time import

    # Compile time transformation that emits a syntax error for a malformed format string
    formatted = !format("string with braces for {name} {lookups} transformed to a runtime <this str>.format(name=name, lookups=lookups) call")

    # Equivalent explicit code (but without any compile time checking of format string validity)
    formatted = "string with braces for {name} {lookups} transformed to a runtime <this str>.format(name=name, lookups=lookups) call".format(name=name, lookups=lookups)

The key for me is that any such operation *must* be transparent to the compiler, so it knows exactly what names you're looking up and can generate the appropriate references for them (including capturing closure variables if necessary). If it's opaque to the compiler, then it's no better than just using a string, which we can already do today.

Regards,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
So, macros basically? The real ones, not #define.

What's wrong with special casing text strings (a little bit more than they already have been)?
On 20 July 2015 at 14:46, Steve Dower <Steve.Dower@microsoft.com> wrote:
I've wished for a cleaner shell command invocation syntax many more times than I've wished for easier string formatting, but I *have* wished for both. Talking to the scientific Python folks, they've often wished for a cleaner syntax to create deferred expressions with the full power of Python's statement level syntax. Explicitly named macros could deliver all three of those, without the downsides of implicit globally installed macros that are indistinguishable from regular syntax. By contrast, the string prefix system is inherently cryptic (being limited to single letters only) and not open to extension and experimentation outside the reference interpreter. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Jul 19, 2015, at 21:58, Nick Coghlan <ncoghlan@gmail.com> wrote:
MacroPy already gives you macros that are explicitly imported, and explicitly marked on use, and nicely readable. And it already works, with no changes to Python, and it includes a ton of little features that you'd never want to add to core Python. There are definitely changes to Python that could make it easier to improve MacroPy or start a competing project, but I think it would be more useful to identify and implement those changes than to try to build a macro system into Python itself. (Making it possible to work on the token level, or to associate bytes/text/tokens/trees/code with each other more easily, or to hook syntax errors and reprocess the bytes/text/tokens, etc. are some such ideas. But I think the most important stuff wouldn't be new features, but removing the annoyances that get in the way of trying to build the simplest possible new macro system from scratch for 3.5, and we probably can't know what those are until someone attempts to build such a thing.)
On 20 July 2015 at 15:18, Andrew Barnert <abarnert@yahoo.com> wrote:
I see nothing explicit about https://pypi.python.org/pypi/MacroPy or the examples at https://github.com/lihaoyi/macropy#macropy, as it looks just like normal Python code to me, with no indication that compile time modifications are taking place. That's not MacroPy's fault - it *can't* readily be explicit the way I would want it to be if it's going to reuse the existing AST compiler to do the heavy lifting. However, I agree the MacroPy approach to tree transformations could be a good backend concept. I'd previously wondered how you'd go about embedding third party syntax like shell expressions or format strings, but eventually realised that combining an AST transformation syntax with string quoting works just fine there. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Jul 19, 2015, at 22:43, Nick Coghlan <ncoghlan@gmail.com> wrote:
I suppose what I meant by explicit is things like using [] instead of () for macro calls and quick lambda definitions, s[""] for string interpolation, etc. Once you get used to it, it's usually obvious at a glance where code is using MacroPy.
I think you want to be able to hook the tokenizer here as well. If you want f"..." or !f"...", that's hard to do at the tree level or the text level; you'd have to do something like f("...") or "f..." instead. But at the token level, it should be trivial. (Well, the second one may not be _quite_ trivial, because I believe there are some cases where you get a !f error instead of a ! error and an f name; I'd have to check.)

I'm pretty sure I could turn my user literal suffix hack into an f-prefix hack in 15 minutes or so. (Obviously it would be nicer if it were doable in a more robust way, and using a framework rather than rewriting all the boilerplate. But my point is that, even without any support at all, it's still not that hard.)

I think you could also use token transformations to do a lot of useful shell-type expressions without quoting, although not full shell syntax; you'd have to play around with the limitations to see if they're worth the benefit of not needing to quote the whole thing.

But as I said before, the real test would be trying to build the framework mentioned parenthetically above and see where it gets annoying and what could change between 3.5 and 3.6 to unannoyingize the code.
On 20 July 2015 at 18:01, Andrew Barnert <abarnert@yahoo.com> wrote:
Is this code using MacroPy for compile time transformations?

    data = source[lookup]

You have no idea, and neither do I. Instead, we'd be relying on our rote memory to recognise certain *names* as being typical MacroPy operations - if someone defines a new transformation, our pattern recognition isn't going to trigger properly.

It doesn't help that my rote memory is awful, so I *detest* APIs that expect me to have a good one and hence "just know" when certain operations are special and don't work the same way as other operations. My suggested "!(expr)" notation is based on the idea of providing an inline syntactic marker to say "magic happening here" (with the default anonymous transformation being to return the AST object itself).
I figured out that AST->AST is fine, as anything else can be handled as quoted string transformations, which then gets you all the nice benefits of strings literals (choice of single or double quotes, triple-quoting for multi-line strings, escape sequences with the option of raw strings, etc), plus a clear inline marker alerting the reader to the fact that you've dropped out of Python's normal syntactic restrictions. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Chris Angelico writes:
I'm -1 on the specific idea, though definitely sympathetic to the broader concept of simplified formatting of strings.
So does everybody. But we've seen many iterations: Perl/shell-style implicit interpolation apparently was right out from the very beginning of Python. The magic print statement was then deprecated in favor of a function. So I suppose it will be very hard to convince the BDFL (and anything implicit would surely need his approval) of anything but a function or an operator. We have the % operator taking a printf-style format string and a tuple of values to interpolate. It's compact and easy to use with position indexes into the tuple for short formats and few values, but is nearly unreadable and not easy to write for long formats with many interpolations, especially if they are repeated.
I think the operator is actually a useful feature, not merely "cute". It directs the focus to the format string, rather than the function call.
and still has the problem of having percent markers with no indication of what they'll be interpolating in.
Not so. We have the more modern (?) % operator that takes a format string with named format sequences and a dictionary. This seems to be close to what the OP wants:

    val = "readable simple formatting method"
    print("This is a %(val)s." % locals())

(which actually works at module level as well as within a function). I suppose the OP will claim that an explicit call to locals() is verbose and redundant, but if that really is a problem:

    def format_with_locals(fmtstr):
        return fmtstr % locals()

(of course with a nice short name, mnemonic to the author). Or for format strings to be used repeatedly with different (global -- the "locals" you want are actually nonlocal relative to a method, so there's no way to get at them AFAICS) values, there's this horrible hack:

    >>> class autoformat_with_globals(str):
    ...     def __pos__(self):
    ...         return self % globals()
    ...
    >>> a = autoformat_with_globals("This is a %(description)s.")
    >>> description = "autoformatted string"
    >>> +a
    'This is a autoformatted string.'

with __neg__ and __invert__ as alternative horrible hacks.

We have str.format. I've gotten used to str.format but for most of my uses mapped %-formatting would work fine.

We have an older proposal for a more flexible form of templating using the Perl/shell-ish $ operator in format strings.

And we have a large number of templating languages from web frameworks (Django, Jinja, etc).

None of these seem universally applicable. It's ugly in one sense (TOOWTDI violation), but ISTM that positional % for short interactive use, mapped % for templating where the conventional format operators suffice, and str.format for maximum explicit flexibility in programs, with context-sensitive formatting of new types, is an excellent combination.
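One caveat on the format_with_locals helper in the message above: locals() is evaluated inside the helper, so it describes the helper's own namespace rather than the caller's. A minimal sketch of a variant that takes the mapping explicitly; format_with is a hypothetical name, not something from the message:

```python
def format_with(fmtstr, namespace):
    # Mapped %-formatting: %(name)s pulls 'name' out of the mapping.
    return fmtstr % namespace

def report():
    val = "readable simple formatting method"
    # The caller hands over its own locals() explicitly.
    return format_with("This is a %(val)s.", locals())

print(report())
```

This keeps the lookup explicit at the call site, at the cost of the extra argument the helper was meant to save.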
Automatically injecting from the locals or globals is a nice source of bugs. Explicit is better than implicit, especially in cases where it can lead to security bugs.

-1

--- Bruce

On Sun, Jul 19, 2015 at 4:35 PM, Mike Miller <python-ideas@mgmiller.net> wrote:
Hello,

On Sun, 19 Jul 2015 16:35:01 -0700 Mike Miller <python-ideas@mgmiller.net> wrote:

[]

"Not sure" sounds convincing. Deal - let's keep being explicit rather than implicit.

Brevity?

    def _(fmt, dict):
        return fmt.format(**dict)

    # Bind the functions themselves so that __() and ___() can be called below.
    __ = globals
    ___ = locals

    foo = 42
    _("{foo}", __())

If that's not terse enough, you can take Python3, and go thru Unicode planes looking for funky-looking letters, then you hopefully can reduce to

    .("{foo}", .())

Where dots aren't dots, but funky-looking letters.

--
Best regards,
Paul mailto:pmiscml@gmail.com
Hmm, I prefer this recipe sent to me directly by joejev:
For yours I'd use the "pile of poo" character: ;)

    💩("{foo}", _())

Both of these might be slower and a bit more awkward than the f'' idea, though I like them.

As to the original post, a pyflakes-type script might be able to look for name errors to assuage concerns, but as I mentioned before I believe the task of matching string/vars is still necessary.

-Mike

On 07/19/2015 04:59 PM, Paul Sokolovsky wrote:
On 07/19/2015 07:35 PM, Mike Miller wrote:
Disclaimer: not well tested code.

This code basically does what you want. It eval's the variables in the caller's frame. Of course you have to be able to stomach the use of sys._getframe() and eval():

    #######################################
    import sys
    import string

    class Formatter(string.Formatter):
        def __init__(self, globals, locals):
            self.globals = globals
            self.locals = locals

        def get_value(self, key, args, kwargs):
            return eval(key, self.globals, self.locals)

    # default to looking at the parent's frame
    def f(str, level=1):
        frame = sys._getframe(level)
        formatter = Formatter(frame.f_globals, frame.f_locals)
        return formatter.format(str)
    #######################################

Usage:

    foo = 42
    print(f('{foo}'))

    def get_closure(foo):
        def _():
            foo # hack: else we see the global 'foo' when calling f()
            return f('{foo}:{sys}')
        return _

    print(get_closure('c')())

    def test(value):
        print(f('value:{value:^20}, open:{open}'))

    value = 7
    open = 3
    test(4+3j)
    del(open)
    test(4+5j)

Produces:

    42
    c:<module 'sys' (built-in)>
    value:       (4+3j)       , open:3
    value:       (4+5j)       , open:<built-in function open>

Eric.
I would prefer something more like:

    def f(s):
        caller = inspect.stack()[1][0]
        return s.format(dict(caller.f_globals, **caller.f_locals))

On July 20, 2015 8:56:54 AM CDT, "Eric V. Smith" <eric@trueblade.com> wrote:
On 07/20/2015 10:08 AM, Ryan Gonzalez wrote:
You need to use format_map (or **dict(...)). And ChainMap might be a better choice, it would take some benchmarking to know. Also, you don't get builtins using this approach. I'm using eval to exactly match what evaluating the variable in the parent context would give you. That might not matter depending on the actual requirements. But I agree there are multiple ways to do this, and several of them could be made to work. Mine might have fatal flaws that more testing would show. Eric.
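The correction above can be sketched as follows. This is only an illustration: the names `f_merged`, `demo` and `demo_builtin` are made up for the example, and the frame lookup copies the `inspect.stack()` approach from the snippet being discussed.

```python
import inspect

def f_merged(s):
    # Frame of the caller, as in the snippet being discussed.
    caller = inspect.stack()[1][0]
    # Merge globals and locals; locals win on a name clash.
    namespace = dict(caller.f_globals, **caller.f_locals)
    # format_map() takes the mapping directly. s.format(namespace) would
    # pass the dict as positional argument 0, so '{name}' lookups would fail.
    return s.format_map(namespace)

def demo():
    name = 'world'
    return f_merged('hello {name}')

def demo_builtin():
    # Builtins live in neither f_globals nor f_locals, so this lookup fails.
    try:
        return f_merged('{len}')
    except KeyError:
        return 'KeyError'
```

The second helper shows the "you don't get builtins" point: a name such as `len` is simply missing from the merged dict.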
On 07/20/2015 10:19 AM, Eric V. Smith wrote:
My quick testing comes up with this, largely based on the code by joejev:

    import sys
    import collections

    def f(str):
        frame = sys._getframe(1)
        return str.format_map(collections.ChainMap(
            frame.f_locals,
            frame.f_globals,
            frame.f_globals['__builtins__'].__dict__))

I'm not sure about the builtins, but this seems to work. Also, you might want to be able to pass in the frame depth to allow this to be callable more than 1 level deep.

So, given that this is all basically possible to implement today (at the cost of using sys._getframe()), I'm -1 on adding any compiler tricks to support this via syntax. From what I know of PyPy, this should be supported there, albeit at a large performance cost.

Eric.
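A possible variant with the frame-depth parameter mentioned above might look like this. It is a sketch, not the code from the message: the `level` parameter and the `log` wrapper are assumptions added for illustration, and it reads builtins from the `builtins` module rather than from `f_globals['__builtins__']`.

```python
import builtins
import collections
import sys

def f(template, level=1):
    # level=1 is the direct caller of f(); larger values walk further up.
    frame = sys._getframe(level)
    return template.format_map(collections.ChainMap(
        frame.f_locals, frame.f_globals, vars(builtins)))

def log(template):
    # Hypothetical wrapper: resolve names in the frame of log()'s caller.
    return 'LOG: ' + f(template, level=2)

def demo():
    user = 'amy'
    return f('{user} {len}'), log('hi {user}')
```

Like the original, this still depends on `sys._getframe()`, which not every Python implementation provides.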
Perhaps surprisingly, I find myself leaning in favor of the f'...{var}...' form. It is explicit in the variable name. Historically, the `x` notation as an alias for repr(x) was meant to play this role -- you'd write '...' + `var` + '...', but it wasn't brief enough, and the `` are hard to see. f'...' is more explicit, and can be combined with r'...' and b'...' (or both) as needed. -- --Guido van Rossum (python.org/~guido)
On 07/20/2015 01:25 PM, Guido van Rossum wrote:
We didn't implement b''.format(), for a variety of reasons. Mostly to do with user-defined types returning unicode from __format__, if I recall correctly.

So the idea is that

    f'x:{a.x} y:{y}'

would translate to bytecode that does:

    'x:{a.x} y:{y}'.format(a=a, y=y)

Correct? I think I could leverage _string.formatter_parser() to do this, although it's been a while since I wrote that. And I'm not sure what's available at compile time. But I can look into it.

I guess the other option is to have it generate:

    'x:{a.x} y:{y}'.format_map(collections.ChainMap(globals(), locals(), __builtins__))

That way, I wouldn't have to parse the string to pick out what variables are referenced in it, then have .format() parse it again.

Eric.
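The first option (find the referenced names, then call .format() with exactly those) can be approximated at run time with the public `string.Formatter().parse()` method instead of the private `_string` helper. This is a rough sketch under stated assumptions: `referenced_names` is a hypothetical helper, it ignores fields nested inside format specs, and it rejects positional fields as suggested elsewhere in the thread.

```python
import string

def referenced_names(template):
    """Return the top-level names used by replacement fields, in order."""
    names = []
    for _literal, field_name, _spec, _conv in string.Formatter().parse(template):
        if field_name is None:
            continue  # trailing literal text, no replacement field
        # Leading identifier, before any '.attr' or '[index]' part.
        first = field_name.replace('[', '.').split('.', 1)[0]
        if not first.isidentifier():
            raise ValueError('positional fields are not allowed: %r' % field_name)
        if first not in names:
            names.append(first)
    return names

class Point:
    def __init__(self, x):
        self.x = x

template = 'x:{a.x} y:{y}'
scope = {'a': Point(3), 'y': 4, 'unused': object()}
names = referenced_names(template)
# Equivalent to 'x:{a.x} y:{y}'.format(a=a, y=y)
result = template.format(**{name: scope[name] for name in names})
```

The string is parsed twice here (once to find the names, once by .format()), which is exactly the cost the second option tries to avoid.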
Eric V. Smith wrote:
That's exactly what I had in mind, at least. Indexing is supported in format strings too, so f'{a[1]}' also becomes '{a[1]}'.format(a=a), but I don't think there are any other strange cases here. I would vote for f'{}' or f'{0}' to just be a SyntaxError. I briefly looked into how this would be implemented and while it's not quite trivial/localized, it should be relatively straightforward if we don't allow implicit merging of f'' strings. If we wanted to allow implicit merging then we'd need to touch more code, but I don't see any benefit from allowing it at all, let alone enough to justify seriously messing with this part of the parser.
If you really want to go with the second approach, ChainMap isn't going to be sufficient, for example:
If the change also came with a dict-like object that will properly resolve variables from the current scope, that would be fine, but I don't think it can be constructed in terms of existing names. (Also bear in mind that other Python implementations do not necessarily provide sys._getframe(), so defining the lookup in terms of that would not be helpful either.) Cheers, Steve
Eric.
(Our posts crossed, to some extent.) On Mon, Jul 20, 2015 at 8:41 PM, Steve Dower <Steve.Dower@microsoft.com> wrote:
[...]
+1 on that last sentence. But I prefer a slightly different way of implementing (see my reply to Eric).
Not sure what you mean by "implicit merging" -- if you mean literal concatenation (e.g. 'foo' "bar" == 'foobar') then I think it should be allowed, just like we support mixing quotes and r''. -- --Guido van Rossum (python.org/~guido)
On 07/20/2015 03:22 PM, Guido van Rossum wrote:
Right. And following up here to that email:
That is better. The trick is converting the string "a.x" to the expression a.x, which should be easy enough at compile time.
It would still probably be best to limit the syntax inside {} to exactly what regular .format() supports, to avoid confusing users.
The expressions supported by .format() are limited to attribute access and "indexing". We just need to enforce that same restriction here.
It would be easiest to not restrict the expressions, but then we'd have to maintain that restriction in two places. And now that I think about it, it's somewhat more complex than just expanding the expression. In .format(), this: '{a[0]}{b[c]}' is evaluated roughly as format(a[0]) + format(b['c']) So to be consistent with .format(), we have to fully parse at least the indexing out to see if it looks like a constant integer or a string. So given that, I think we should just support what .format() allows, since it's really not quite as simple as "evaluate the expression inside the braces".
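The indexing rule described above can be checked directly with today's str.format(). The variable names are only for illustration; note that the variable `c` is deliberately defined and still ignored.

```python
a = ['zero', 'one']
b = {'c': 'by string key'}
c = 'ignored'  # str.format() never looks at this variable

text = '{a[0]}|{b[c]}'.format(a=a, b=b)

# What str.format() effectively does: '0' looks like an integer, so it
# indexes with int 0; 'c' does not, so it is used as the string key 'c'.
by_hand = format(a[0]) + '|' + format(b['c'])
```

Both `text` and `by_hand` hold the same string, which is why a plain "evaluate the expression inside the braces" rule would not match .format().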
If I understand it, I think the concern is: f'{a}{b}' 'foo{}' f'{c}{d}' would need to become: f'{a}{b}foo{{}}{c}{d}' So you have to escape the braces in non-f-strings when merging strings and any of them are f-strings, and make the result an f-string. But I think that's the only complication.
On 07/20/2015 03:52 PM, Eric V. Smith wrote:
And thinking about it yet some more, I think the easiest and most consistent thing to do would be to translate it like:

    f'{a[0]}{b[c]}' == '{[0]}{[c]}'.format(a, b)

So:

    f'api:{sys.api_version} {a} size{sys.maxsize}'

would become either:

    'api:{.api_version} {} size{.maxsize}'.format(sys, a, sys)

or

    'api:{0.api_version} {1} size{0.maxsize}'.format(sys, a)

The first one seems simpler. The second probably isn't worth the micro-optimization, and it may even be a pessimization.

Eric.
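Both candidate translations are ordinary str.format() calls, so they can be compared today. A small check, where the value bound to `a` is an arbitrary placeholder:

```python
import sys

a = 'value'

# Auto-numbered fields: one positional argument per replacement field.
auto = 'api:{.api_version} {} size{.maxsize}'.format(sys, a, sys)

# Explicitly numbered fields: each object is passed only once.
numbered = 'api:{0.api_version} {1} size{0.maxsize}'.format(sys, a)

# The same result spelled out with format() on each evaluated expression.
direct = ('api:' + format(sys.api_version) + ' ' + format(a)
          + ' size' + format(sys.maxsize))
```

All three expressions build the same string; the difference is only in how much bookkeeping the translation step has to do.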
On Mon, Jul 20, 2015 at 11:41 AM, Steve Dower <Steve.Dower@microsoft.com> wrote:
Maybe I'm missing something, but it seems this could just as reasonably be '{}'.format(a[1])? Is there a reason to prefer the other form over this? On Mon, Jul 20, 2015 at 1:20 PM, Eric V. Smith <eric@trueblade.com> wrote:
Or:

    'api:{} {} size{}'.format(sys.api_version, a, sys.maxsize)

Note that format strings don't allow variables in subscripts, so

    f'{a[n]}' ==> '{}'.format(a['n'])

Also, the discussion has assumed that if this feature were added it necessarily must be a single character prefix. Looking at the grammar, I don't see that as a requirement as it explicitly defines multiple character sequences. A syntax like:

    format'a{b}c'
    formatted"""a{b}
    c"""

might be more readable. There's no namespace conflict just as there is no conflict between raw string literals and a variable named r.

--- Bruce
On 7/20/2015 5:29 PM, Bruce Leban wrote:
Right. But why re-implement that, instead of making it: '{[n]}'.format(a)?

I've convinced myself (and maybe no one else) that since you want this:

    a=[1,2]
    b={'c':42}
    f'{a[0]} {b[c]}'

being the same as:

    '{} {}'.format(a[0], b['c'])

that it would be easier to make it:

    '{[0]} {[c]}'.format(a, b)

instead of trying to figure out that the numeric-looking '0' gets converted to an integer, and the non-numeric-looking 'c' gets left as a string. That logic already exists in str.format(), so let's just leverage it from there.

It also means that you automatically will support the subset of expressions that str.format() already supports, with all of its limitations and quirks. But I now think that's a feature, since str.format() doesn't really support the same expressions as normal Python does (due to the [0] vs. ['c'] issue). And it's way easier to explain if f-strings support the identical syntax as str.format(). The only restriction is that all parameters must be named, and not numbered or auto-numbered.

Eric.
Eric V. Smith wrote:
Right. But why re-implement that, instead of making it: '{[n]}'.format(a)?
Consider also the case of custom formatters. I've got one that overloads format_field, adds a units specifier in the format, which then uses our model units conversion and writes values in the current user-units of the system: x = body.x_coord # A "Double()" object with units of length. print(f'{x:length:.3f}') # Uses the "length" string to perform a units conversion much as "!r" would invoke "repr()". I think your proposal above handles my use case the most cleanly. Another Eric
Eric V. Smith writes:
Yes, please! Guido's point that he wants no explicit use of locals(), etc, in the implementation took me a bit of thought to understand, but then I realized that it means a "macro" transformation with the resulting expression evaluated in the same environment as an explicit .format() would be. And that indeed makes the whole thing as explicit as invoking str.format would be. I don't *really* care what transformations are used to get that result, but DRYing this out and letting the __format__ method of the indexed object figure out the meaning of the format string makes me feel better about my ability to *think* about the meaning of an f"..." string. In particular, isn't it possible that a user class's __format__ might decide that *all* keys are strings? I don't see how the transformation Steve Dower proposed can possibly deal with that ambiguity. Another conundrum is that it's not obvious whether f"{a[01]}" is a SyntaxError (as it is with str.format) or equivalent to "{}".format(a['01']) (as my hypothetical user's class would expect).
On 7/21/2015 1:16 AM, Stephen J. Turnbull wrote:
Right. That is indeed the beauty of the thing. I now think locals(), etc. is a non-starter.
In today's world: '{a[0]:4d}'.format(a=a) the object whose __format__() method is being called is a[0], not a. So it's not up to the object to decide what the keys mean. That decision is being made by the ''.format() implementation. And that's also the way I'm envisioning it with f-strings.
It would still be a syntax error, in my imagined implementation, because it's really calling ''.format() to do the expansion.

So here's what I'm thinking f'some-string' would expand to. As you note above, it's happening in the caller's context:

    new_fmt = remove_all_object_names_from_string(s)
    objs = find_all_objects_referenced_in_string(s)
    result = new_fmt.format(*objs)

So given:

    X = namedtuple('X', 'name width')
    a = X('Eric', 10)
    value = 'some value'

then:

    f'{a.name:*^{a.width}}:{value}'

would become this transformed code:

    '{.name:*^{.width}}:{}'.format(*[a, a, value])

which would evaluate to:

    '***Eric***:some value'

The transformation of the f-string to new_fmt and the computation of objs is the only new part. The transformed code above works today.

Eric.
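The transformed code from the message can be run as written. In this sketch `new_fmt` and `objs` are filled in by hand, since the two helper functions named above are only imagined in the message and are not implemented here.

```python
from collections import namedtuple

X = namedtuple('X', 'name width')
a = X('Eric', 10)
value = 'some value'

# Hand-written stand-ins for the results of the two imagined helpers.
new_fmt = '{.name:*^{.width}}:{}'
objs = [a, a, value]

# Auto-numbering assigns 0 to '.name', 1 to the nested '.width', 2 to '{}'.
result = new_fmt.format(*objs)
```

The nested field inside the format spec consumes its own positional argument, which is why `a` appears twice in `objs`.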
On Mon, Jul 20, 2015 at 4:20 PM, Eric V. Smith <eric@trueblade.com> wrote:
I think Python can do more at compile time and translate f"Result1={expr1:fmt1};Result2={expr2:fmt2}" to bytecode equivalent of "Result1=%s;Result2=%s" % ((expr1).__format__(fmt1), (expr2).__format__(fmt2))
I'd rather keep the transform as simple as possible. If text formatting is your bottleneck, congratulations on fixing your network, disk, RAM and probably your users. Those who need to micro-optimize this code can do what you suggested by hand - there's no need for us to make our lives more complicated for the straw man who has a string formatting bottleneck and doesn't know enough to research another approach.

Cheers, Steve
On Mon, Jul 20, 2015 at 10:10 PM, Steve Dower <Steve.Dower@microsoft.com> wrote:
If text formatting is your bottleneck, congratulations on fixing your network, disk, RAM and probably your users.
Thank you, but one of my servers just spent 18 hours loading 10GB of XML data into a database. Given that CPU was loaded 100% all this time, I suspect neither network nor disk and not even RAM was the bottleneck. Since XML parsing was done by C code and only formatting of database INSERT instructions was done in Python, I strongly suspect string formatting had a sizable carbon footprint in this case. Not all string formatting is done for human consumption.
On Tue, Jul 21, 2015 at 12:44 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
Well-known rule of optimization: Measure, don't assume. There could be something completely different that's affecting your performance. I'd be impressed and extremely surprised if the formatting of INSERT queries took longer than the execution of those same queries, but even if that is the case, it could be the XML parsing (just because it's in C doesn't mean it's inherently faster than any Python code), or the database itself, or suboptimal paging of virtual memory. Before pointing fingers anywhere, measure. Measure. Measure! ChrisA
On Mon, Jul 20, 2015 at 10:53 PM, Chris Angelico <rosuav@gmail.com> wrote:
This is getting off-topic for this list, but you may indeed be surprised by the performance that kdb+ (kx.com) with PyQ (pyq.enlnt.com) can deliver. [Full disclosure: I am the author of PyQ, so sorry for a shameless plug.]
Sounds like you deserve the congratulations then :)

But when you've confirmed that string formatting is something that can be changed to improve performance (specifically parsing the format string in this case), you have options regardless of the default optimization. For instance, you probably want to preallocate a list, format and set each non-string item, then use .join (or if possible, write directly from the list without the intermediate step of producing a single string).

Making f"" strings subtly faster isn't going to solve your performance issue, and while I'm not advocating wastefulness, this looks like a premature optimization, especially when put alongside the guaranteed heap allocations and very likely IO that are also going to occur.

Cheers, Steve
On Mon, Jul 20, 2015 at 11:16 PM, Steve Dower <Steve.Dower@microsoft.com> wrote:
One thing I know for a fact is that the use of % formatting instead of .format makes a significant difference in my applications. This is not surprising given these timings:

    $ python3 -mtimeit "'%d' % 2"
    100000000 loops, best of 3: 0.00966 usec per loop
    $ python3 -mtimeit "'{}'.format(2)"
    1000000 loops, best of 3: 0.216 usec per loop

As a result, my rule of thumb is to avoid the use of .format in anything remotely performance critical. If f"" syntax is implemented as a sugar for .format - it will be equally useless for most of my needs. However, I think it can be implemented in a way that will make me consider switching away from % formatting.
[Alexander Belopolsky]
Well, be sure to check what you're actually timing. Here under Python 3.4.3:
That is, the peephole optimizer got rid of "%d" % 2 entirely, replacing it with the string constant "2". So, in all, it's more surprising that it takes so long to load a constant ;-)
On Tue, Jul 21, 2015 at 1:35 PM, Tim Peters <tim.peters@gmail.com> wrote:
Interesting that the same optimization can't be done on the .format() version - it's not as if anyone can monkey-patch str so it does something different, is it?

To defeat the optimization, I tried this:

    rosuav@sikorsky:~$ python3 -mtimeit -s "x=2" "'%d' % 2"
    100000000 loops, best of 3: 0.0156 usec per loop
    rosuav@sikorsky:~$ python3 -mtimeit -s "x=2" "'%d' % x"
    10000000 loops, best of 3: 0.162 usec per loop
    rosuav@sikorsky:~$ python3 -mtimeit -s "x=2" "'{}'.format(2)"
    1000000 loops, best of 3: 0.225 usec per loop
    rosuav@sikorsky:~$ python3 -mtimeit -s "x=2" "'{}'.format(x)"
    1000000 loops, best of 3: 0.29 usec per loop

The difference is still there, but it's become a lot less dramatic - about two to one. I think that's the honest difference between them, and that's not usually going to be enough to make any sort of significant difference.

ChrisA
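The same comparison can be scripted with the standard `timeit` module. This is a minimal sketch: `compare` is a hypothetical helper name, and the formatted value comes from a variable in the setup code so that neither statement can be reduced to a constant. Absolute numbers depend on the machine and Python version, so none are shown.

```python
import timeit

def compare(number=100_000):
    """Time %-formatting against str.format() on a non-constant operand."""
    setup = 'x = 2'
    percent = timeit.timeit("'%d' % x", setup=setup, number=number)
    dot_format = timeit.timeit("'{}'.format(x)", setup=setup, number=number)
    return percent, dot_format

if __name__ == '__main__':
    percent, dot_format = compare()
    print('%-formatting: {:.4f}s  str.format: {:.4f}s'.format(percent, dot_format))
```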
On Mon, Jul 20, 2015 at 11:35 PM, Tim Peters <tim.peters@gmail.com> wrote:
Hmm. I stand corrected:

    $ python3 -mtimeit -s "a=2" "'%s' % a"
    10000000 loops, best of 3: 0.124 usec per loop
    $ python3 -mtimeit -s "a=2" "'{}'.format(a)"
    1000000 loops, best of 3: 0.215 usec per loop

It is a 2x rather than a 20x speed difference.
On 7/21/2015 12:02 AM, Alexander Belopolsky wrote:
The last time I looked at this, the performance difference was the lookup of "format" on a string object. Although maybe that's not true, and the problem is really function call overhead:

    $ python3 -mtimeit -s 'a=2' 'f="{}".format' 'f(a)'
    1000000 loops, best of 3: 0.227 usec per loop
    $ python3 -mtimeit -s "a=2" "'%s' % a"
    10000000 loops, best of 3: 0.126 usec per loop

There is (or was) a special case for formatting str, int, and float to bypass the .__format__ lookup. I haven't looked at it since the PEP 393 work.

Eric.
On 7/21/2015 12:25 AM, Eric V. Smith wrote:
Oops, that should have been:

    $ python3 -mtimeit -s 'a=2; f="{}".format' 'f(a)'
    1000000 loops, best of 3: 0.19 usec per loop
    $ python3 -mtimeit -s "a=2" "'%s' % a"
    10000000 loops, best of 3: 0.138 usec per loop

So, about 40% slower if you can get rid of the .format lookup, which we can do with f-strings. Because it's more flexible, .format is just never going to be as fast as %-formatting. But there's no doubt room for improvement.

The recursive nature of:

    f'{o.name:{o.len}}'

will complicate some of the optimizations I've been thinking of.
Looks like it's still there. Also for complex! Eric.
That's almost certainly something that can be improved though, and maybe it's worth some investment from you. Remember, Python doesn't get better by magic - it gets better because someone gets annoyed enough about it that they volunteer to fix it (at least, that's how I ended up getting so involved :) ).

My wild guess is that calling int.__format__ is the slow part, though I'd have hoped that it wouldn't be any slower for default formatting... guess not. We've got sprints coming up at PyData next week, so maybe I'll try and encourage someone to take a look and see what can be improved here.

Cheers, Steve
On Mon, Jul 20, 2015 at 9:52 PM, Eric V. Smith <eric@trueblade.com> wrote:
I wonder if we could let the parser do this? consider f'x:{ as one token and so on?
Oooh, this is very unfortunate. I cannot support this. Treating b[c] as b['c'] in a "real" format string is one way, but treating it that way in an expression is just too weird.
Alas. And this is probably why we don't already have this feature.
That's possible; another possibility would be to just have multiple .format() calls (one per f'...') and use the + operator to concatenate the pieces. -- --Guido van Rossum (python.org/~guido)
On 7/21/2015 2:05 AM, Guido van Rossum wrote:
I think you're right here, and my other emails were trying too much to simplify the implementation and keep the parallels with str.format(). The difference between str.format() and f-strings is that in str.format() you can have an arbitrarily complex expression as the passed in argument to .format(). With f-strings, you'd be limited to just what can be extracted from the string itself: there are no arguments to be passed in. So maybe we do want to allow arbitrary expressions inside the f-string.

For example:

    '{a.foo}'.format(a=b[c])

If we limit f-strings to just what str.format() string expressions can represent, it would be impossible to represent this with an f-string, without an intermediate assignment. But if we allowed arbitrary expressions inside an f-string, then we'd have:

    f'{b[c].foo}'

and similarly:

    '{a.foo}'.format(a=b['c'])

would become:

    f'{b["c"].foo}'

But now we'd be breaking compatibility with str.format(). Maybe it's worth it, though. I can see 80% of the uses of str.format() being replaced by f-strings. The remainder would be cases where format strings are passed in to other functions. I do this a lot with custom logging [1].

The implementation complexity goes up by allowing arbitrary expressions. Not that that is necessarily a reason to drive a design decision. For example:

    f'{a[2:3]:20d}'

We need to extract the expression "a[2:3]" and the format spec "20d". I can't just scan for a colon any more, I've got to actually parse the expression until I find a "}", ":", or "!" that's not part of the expression so that I know where it ends. But since it's happening at compile time, I surely have all of the tools at my disposal. I'll have to look through the grammar to see what the complexities here are and where this would fit in.
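For comparison, the two spellings discussed above can be put side by side. This sketch assumes an interpreter where f-strings with unrestricted expressions exist (Python 3.6 or later); the data values are invented for the example.

```python
from types import SimpleNamespace

b = {'c': SimpleNamespace(foo='spam')}
items = [1, 2, 3, 4]

# str.format(): the arbitrary expression lives in the argument list.
via_format = '{a.foo}'.format(a=b['c'])

# f-string: the expression sits inside the braces, with normal Python
# semantics, so the string key has to be quoted.
via_fstring = f'{b["c"].foo}'

# The parsing problem from the message: the first ':' belongs to the slice,
# only the last one starts the format spec, and '!s' is a conversion.
sliced = f'{items[1:3]!s:>10}'
```

Here `sliced` is the text `[2, 3]` right-aligned in a field of width 10, which shows why scanning for the first colon is not enough.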
Agreed. So I think it's either "don't be compatible with str.format expressions" or "abandon the proposed f-strings".
Right. I think the application would actually use _PyUnicodeWriter to build the string up, but it would logically be equivalent to:

    'foo ' f'b:{b["c"].foo:20d} is {on_off}' ' bar'

becoming:

    'foo ' + 'b:' + format(b["c"].foo, '20d') + ' is ' + format(on_off) + ' bar'

At this point, the implementation wouldn't call str.format() because it's not being used to evaluate the expression. It would just call format() directly. And since it's doing that without having to look up .format on the string, we'd get some performance back that str.format() currently suffers from.

Nothing is really lost by not merging the adjacent strings, since the f-strings by definition are replaced by function calls. Maybe the optimizer could figure out that 'foo ' + 'b:' could be merged in to 'foo b:'. Or maybe the user should refactor the strings if it's that important.

I'm out of the office all day and won't be able to respond to any follow ups until later. But that's good, since I'll be forced to think before typing!

Eric.

[1] Which makes me think of the crazy idea of passing in unevaluated f-strings in to another function to be evaluated in their context. But the code injection opportunities with doing this with arbitrary user-specified strings are just too scary to think about. At least with str.format() you're limited in to what the expressions can do. Basically indexing and attribute access. No function calls: '{.exit()}'.format(sys) !
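The logical expansion described above can be compared with an actual f-string on an interpreter that has them (Python 3.6 or later is assumed). The values of `b` and `on_off` are invented, and the last part checks the footnote's point that str.format() cannot make a function call.

```python
import sys
from types import SimpleNamespace

b = {'c': SimpleNamespace(foo=7)}
on_off = 'on'

# Adjacent literals, one of them an f-string, are concatenated implicitly.
literal = 'foo ' f'b:{b["c"].foo:20d} is {on_off}' ' bar'

# The expansion from the message: plain pieces joined with format() calls.
expanded = ('foo ' + 'b:' + format(b["c"].foo, '20d')
            + ' is ' + format(on_off) + ' bar')

# str.format() treats 'exit()' as an attribute name, so nothing is called.
try:
    '{.exit()}'.format(sys)
    call_blocked = False
except AttributeError:
    call_blocked = True
```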
Thanks, Eric! You're addressing all my concerns and you're going exactly where I wanted this to go. I hope that you will find the time to write up a PEP; take your time. Regarding your [1], let's not consider unevaluated f-strings as a feature; that use case is sufficiently covered by the existing str.format(). On Tue, Jul 21, 2015 at 1:58 PM, Eric V. Smith <eric@trueblade.com> wrote:
-- --Guido van Rossum (python.org/~guido)
On 7/21/2015 9:03 AM, Guido van Rossum wrote:
Thanks, Guido. I'd already given some thought to a PEP. I'll work on it. I don't have a ton of free time, but I'd like to at least get the ideas presented so far written down.

One thing I haven't completely thought through is nested expressions:

    f'{value:.{precision}f}'

I guess this would just become:

    format(value, '.' + format(precision) + 'f')

If I recall correctly, we only support recursive fields in the format specifier portion, and only one level deep. I'll need to keep that in mind.

If this gets accepted, I'll have to speed up my own efforts to port my code to 3.x.

Eric.
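The nested case can be checked against the existing str.format() behaviour and, where f-strings are available (Python 3.6 or later is assumed), against the literal itself. The numbers are arbitrary.

```python
value = 3.14159
precision = 2

# The proposed expansion: build the format spec first, then format the value.
expanded = format(value, '.' + format(precision) + 'f')

# The equivalent nested replacement field in str.format().
via_format = '{:.{}f}'.format(value, precision)

# The f-string spelling from the message.
via_fstring = f'{value:.{precision}f}'
```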
On 21 July 2015 at 21:58, Eric V. Smith <eric@trueblade.com> wrote:
Yeah, this is why I think anything involving implicit interpolation needs to be transparent to the compiler: the security implications with anything other than literal format strings or some other explicitly compile time operation are far too "exciting" otherwise.

I wonder though, if we went with the f-strings idea, could we make them support a *subset* of the "str.format" call syntax, rather than a superset? What if they supported name and attribute lookup syntax, but not positional or subscript lookup? They'd still be great for formatting output in scripts and debugging messages, but more complex formatting cases would still involve reaching for str.format, str.format_map or exec("print(f'{this} is an odd way to do a {format_map} call')", namespace).

Regards, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Tue, Jul 21, 2015 at 3:05 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I don't know. Either way there's going to be complaints about the inconsistencies. :-( I wish we hadn't done the {a[x]} part of PEP 3101, but it's too late now. :-(
You lost me there (probably by trying to be too terse). -- --Guido van Rossum (python.org/~guido)
On Tue, 21 Jul 2015 at 14:14 Nick Coghlan <ncoghlan@gmail.com> wrote:
Please don't do either. Python already has a surplus of string formatting mini-languages. Making a new one that is similar but not the same as one of the others is a recipe for confusion as well as an additional learning burden for new users of the language. -- Oscar
On Tue, Jul 21, 2015 at 6:50 PM, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
I'm not sure if you meant it this way, but if we really believed that, the only way to avoid confusion would be not to introduce f'' strings at all. (Which, BTW is a valid outcome of this discussion -- even if a PEP is written it may end up being rejected.)

Personally I think that the different languages are no big deal, since realistically the vast majority of use cases will use simple variables (e.g. foo) or single attributes (e.g. foo.bar). Until this discussion I had totally forgotten several of the quirks of PEP 3101, including: a[c] meaning a['c'] elsewhere; the ^ format character and the related fill/align feature; nested substitutions; the top-level format() function. Also, I can never remember how to use !r.

I actually find it quite unfortunate that the formatting mini-language gives a[c] the meaning of a['c'] elsewhere, since it means that the formatting mini-language to reference variables is neither a subset nor a superset of the standard expression syntax. We have a variety of other places in the syntax where a slightly different syntax is supported (e.g. it's quite subtle how commas are parsed, and decorators allow a strict subset of expressions) but the formatting mini-language is AFAIR the only one that gives a form that is allowed elsewhere a different meaning.

-- --Guido van Rossum (python.org/~guido)
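For readers who have also forgotten them, the PEP 3101 quirks listed above can each be shown in one line with the existing str.format() machinery. The sample values are invented.

```python
a = {'c': 'string key', 1: 'int key'}
c = 1

# a[c] inside a format string means a['c'], unlike the same text as code.
in_format = '{a[c]}'.format(a=a)
as_expression = a[c]

# The ^ alignment character with a '*' fill character.
centered = '{:*^9}'.format('mid')

# A nested substitution supplying the field width.
nested = '{:{}}'.format('x', 3)

# The top-level format() function and the !r conversion.
top_level = format(255, 'x')
with_repr = '{!r}'.format('q')
```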
My apologies, as I've read through this thread again and haven't found the reason the approach last mentioned by Eric S. was abandoned: f'{a[name]}' ==> '{[name]}'.format(a) This seemed to solve things neatly, letting .format() handle the messy details it already handles. Also it's easy to lift the identifier names out using their rules. Simple implementation, easy to understand. Then conversation switched to this alternative: f'{a[name]}' ==> '{}'.format(a['name']) Which has the drawback that some of the complexity of the mini-language will need to be reimplemented. Second, there is an inconsistency in quoting of string dictionary keys. That's unfortunate, but the way format currently works. Since f'' will be implemented on top, is not the quoting issue orthogonal to it? If the unquoted str dict key is indeed unacceptable I submit it should be deprecated (or not) separately, but not affect the decision on f''. Again though, I feel like I'm missing an important nugget of information. -Mike On 07/21/2015 10:58 AM, Guido van Rossum wrote:
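The two candidate translations mentioned above can be compared directly with today's str.format. This is only a sketch: the dictionary is a made-up stand-in for the variable a in the thread.

```python
# Hypothetical value standing in for the variable `a` in the thread.
a = {'name': 'spam'}

# Option 1: keep the subscript inside the format string and pass only
# the leading identifier; str.format treats [name] as the key 'name'.
option_1 = '{[name]}'.format(a)

# Option 2: lift the whole expression out; the key must now be quoted
# because it is evaluated as normal Python.
option_2 = '{}'.format(a['name'])

print(option_1, option_2)
```

Both produce the same text here; the difference lies in who interprets the subscript, str.format or the Python compiler.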
Mike Miller writes:
That's what I thought, too, but according to https://mail.python.org/pipermail/python-ideas/2015-July/034728.html, that's not true. The problem is that .format accepts *arbitrary* expressions as arguments, eg "{a.attr}".format(a=f()), which can't be expressed as an f-string within the limits of current .format specs. Finally Eric concludes that you end up with a situation where format would need to be called directly, and str.format isn't involved at all. I haven't studied the argument that leads there, but that's the context you're looking for, I believe. Python-Ideas meta: The simple implementation is surely still on the table, although I get the feeling Guido is unhappy with the restrictions implied. However, it is unlikely to be discussed again here precisely because those who understand the implementation of str.format well already understand the implications of this implementation very well -- further discussion is unnecessary. In fact, Guido asking for a PEP may put a "paragraph break" into this discussion at this point -- we have several proposed implementations with various amounts of flexibility, and the proponents understand them even if I, and perhaps you, don't. What's left is the grunt work of thinking out the corner cases and creating one or more proof-of-concept implementations, then writing the PEP. Steve
On Wed, Jul 22, 2015 at 9:04 AM, Mike Miller <python-ideas@mgmiller.net> wrote:
So I guess the question is: Does f"..." have to be implemented on top of str.format, or should it be implemented separately on top of object.__format__? The two could be virtually indistinguishable anyway. Something like this: loc = "world" print(f"Hello, {loc}!") # becomes loc = "world" print("Hello, "+loc.__format__("")+"!") # maybe with the repeated concat optimized to a join With that, there's no particular reason for the specifics of .format() key lookup to be retained. Want full expression syntax? Should be easy - it's just a matter of getting the nesting right (otherwise it's a SyntaxError, same as (1,2,[3,4) would be). Yes, it'll be a bit harder for simplistic parsers to work with, but basically, this is no longer a string literal - it's a compact syntax for string formatting and concatenation, which is something I can definitely get behind. REXX allowed abuttal for concatenation, so you could write something like this: msg = "Hello, "loc"!" Replace those interior quotes with braces, and you have an f"..." string. It's not a string, it's an expression, and it can look up names in its enclosing scope. Describe it alongside list comprehensions, lambda expressions, and so on, and it fits in fairly nicely. No longer -0.5 on this. ChrisA
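A runnable sketch of the translation described above, written without f-string syntax. It uses the built-in format(value, spec), which dispatches to the value's __format__ method, rather than calling loc.__format__("") directly; that substitution is a choice made for this example.

```python
loc = "world"

# Format each value with an empty format spec, then join the pieces,
# roughly as the post suggests a compiler might do.
pieces = ["Hello, ", format(loc, ""), "!"]
greeting = "".join(pieces)

# str.format produces the same text for this simple case.
assert greeting == "Hello, {}!".format(loc)
print(greeting)  # Hello, world!
```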
Pretty sure I'm going to be odd one out, here... I don't like most of Ruby, but, after using Crystal and CoffeeScript, I have fallen in love with #{}. It gives the appearance of a real expression, not just a format placeholder. Like: f'a#{name}b' On July 21, 2015 7:15:21 PM CDT, Chris Angelico <rosuav@gmail.com> wrote:
-- Sent from my Android device with K-9 Mail. Please excuse my brevity.
On July 21, 2015 7:30:02 PM CDT, Chris Angelico <rosuav@gmail.com> wrote:
I'm referring to the syntax, though. It makes it visually distinct from normal format string placeholders. Also: plain format strings are a *pain* to escape in code generators, e.g.: print('int main() {{ return 0+{name}; }}'.format(name=name)) Stupid double brackets. That is why I still use % formatting. #{} isn't a common expression to ever use. I have never actually printed a string that contains #{} (except when using interpolation in CoffeeScript/Crystal).
-- Sent from my Android device with K-9 Mail. Please excuse my brevity. Currently listening to: Deep Drive by Yoko Shimomura (KH 2.5)
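The escaping complaint above can be seen side by side in a small sketch. The identifier x is a made-up value to splice into the generated C source.

```python
name = 'x'  # hypothetical identifier to splice into the generated C code

# str.format: every literal brace must be doubled.
with_format = 'int main() {{ return 0+{name}; }}'.format(name=name)

# printf-style: braces are ordinary characters, only % is special.
with_percent = 'int main() { return 0+%s; }' % name

print(with_format)   # int main() { return 0+x; }
print(with_percent)  # int main() { return 0+x; }
```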
Guido van Rossum wrote:
(Our posts crossed, to some extent.)
Yeah, mine delayed about 20 minutes after I sent it before I saw the list send it out. Not sure what's going on there...
Yep, saw that and after giving it some thought I agree. Initially I liked the cleanliness of not modifying the original string, but the transform does seem easier to explain as "lifting" each expression out of the string (at least compared to "lifting the first part of each expression and combining duplicates and assuming they are always the same value"). One catch here is that '{a[b]}' has to transform to '{}'.format(a['b']) and not .format(a[b]), which is fine but an extra step. IIRC you can only use ints and strs as keys in a format string.
Except we don't really have a literal now - it's an expression. Does f"{a}" "{b}" become "{}{}".format(a, b), "{}".format(a) + "{b}" or "{}{{b}}".format(a)? What about f"{" f"{a}" f"}"? Should something that looks like literal concatenation silently become runtime concatenation? Yes, it's possible to answer and define all of these, but I don't see how it adds value (though I am one of those people who never use literal concatenation and advise others not to use it either), and I see plenty of ways it would unnecessarily extend discussion and prevent actually getting something done. Cheers, Steve
-- --Guido van Rossum (python.org/~guido)
On 07/20/2015 03:22 PM, Guido van Rossum wrote:
Do we really want to support this? It complicates the implementation, and I'm not sure of the value. f'{foo}' 'bar' f'{baz}' becomes something like: format(foo) + 'bar' + format(baz) You're not merging similar things, like you are with normal string concatenation. And merging f-strings: f'{foo}' f'{bar}' similarly just becomes concatenating the results of some function calls. I guess it depends if you think of an f-string as a string, or an expression (like the function calls it will become). I don't have a real strong preference, but I'd like to get it ironed out logically before doing a trial implementation. Eric.
On 7/22/2015 4:21 PM, MRAB wrote:
True, but f-strings aren't string literals. They're expressions disguised as string literals.
While they look alike, they're not at all similar. Nothing is being merged, since the f-string is being evaluated at runtime, not compile time. I'm not sure if it would be best to hide this runtime string concatenation behind something that looks like it has less of a cost. At runtime, it's likely going to look something like: ''.join([f'foo', 'bar']) although using _PyUnicodeWriter, I guess. Eric.
Executive summary for those in a hurry: * implicit concatenation of strings *of any type* always occurs at compile-time; * if the first (or any?) of the concat'ed fragments begin with an f prefix, then the resulting concatenated string is deemed to begin with an f prefix and is compiled to a call to format (or some other appropriate implementation), which is a run-time operation; * the peep-hole optimizer has to avoid concat'ing mixed f and non-f strings: f'{spam}' + '{eggs}' should evaluate to something like (format(spam) + '{eggs}'). Longer version with more detail below. On Wed, Jul 22, 2015 at 02:52:30PM -0400, Eric V. Smith wrote:
I would not want or expect that behaviour. However, I would want and expect that behaviour with *explicit* concatenation using the + operator. I would want the peephole optimizer to avoid optimizing this case: f'{foo}' + 'bar' + f'{baz}' and allow it to be compiled to something like: format(foo) + 'bar' + format(baz) With explicit concatenation, the format() calls occur before the + operators are called. Constant-folding 'a' + 'b' to 'ab' is an optimization, it doesn't change the semantics of the concat. But constant-folding f'{a}' + '{b}' would change the semantics of the concatenation, because f strings aren't constants, they only look like them. In the case of *implicit* concatenation, I think that the concatenations should occur first, at compile time. Yes, that deliberately introduces a difference between implicit and explicit concatenation, that's a feature, not a bug! Implicit concatenation will help in the same cases that implicit concatenation usually helps: long strings without newlines: msg = (f'a long message here blah blah {x}' f' and {y} and {z} and more {stuff} and {things}' f' and perhaps even more {poppycock}' ) That should be treated as syntactically equivalent to: msg = f'a long message here blah blah {x} and {y} and {z} and more {stuff} and {things} and perhaps even more {poppycock}' which is then compiled into the usual format(...) magic, as normal. So, a very strong +1 on allowing implicit concatenation. I would go further and allow all the f prefixes apart from the first to be optional. To put it another way, the first f prefix "infects" all the other string fragments: msg = (f'a long message here blah blah {x}' ' and {y} and {z} and more {stuff} and {things}' ' and perhaps even more {poppycock}' ) should be exactly the same as the first version. My reasoning is that the implicit concatenation always occurs first, so by the time the format(...) 
magic occurs at run-time, the interpreter no longer knows which braces came from an f-string and which came from a regular string. (Implicit concatenation is a compile-time operation, the format(...) stuff is run-time, so there is a clear and logical order of operations.) To avoid potential surprises, I would disallow the case where the f prefix doesn't occur in the first fragment, or at least raise a compile-time warning: 'spam' 'eggs' f'{cheese}' should raise or warn. (That restriction could be removed in the future, if it turns out not to be a problem.)
That's safe to do at compile-time: f'{foo}' f'{bar}' and f'{foo}{bar}' will always be the same. There's no need to delay the concat until after the formats. -- Steve
On Thu, Jul 23, 2015 at 1:31 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Editors/IDEs would have to be taught about this (and particularly, taught to show f"..." very differently from "..."), because otherwise a missed comma could be very surprising: lines = [ "simple string", "string {with} {braces}", f"formatted string {using} {interpolation}" # oops, missed a comma "another string {with} {braces}" ] Could be a pain to try to debug that one, partly because stuff is happening at compile time, so you can't even pretty-print lines immediately after assignment to notice that there are only three elements rather than four. That solved, though, I think you're probably right about the f prefix infecting the remaining fragments. It'd be more consistent to have it *not* infect, the same way that an r or u/b prefix doesn't infect subsequent snippets; but if anyone's bothered by it, they can always stick in a few plus signs. ChrisA
I'm against f'' infecting subsequent "literals". After all, r'' or b'' don't infect their neighbors.
On Wed, Jul 22, 2015 at 8:31 PM, Steven D'Aprano <steve@pearwood.info> wrote:
It doesn't have to change semantics and it shouldn't. This is a strawman argument. While we could do it wrong, why would we? It's hardly difficult to quote the non-format string while still optimizing the concatenation. That is, f'{foo}' '{bar}' f'{oof}' could compile to the same thing as if you wrote: f'{foo}{{bar}}{oof}' the result of something like this: COMPILE_TIME_FORMAT_TRANSFORM('{foo}' + COMPILE_TIME_ESCAPE('{bar}') + '{oof}') This is analogous to what happens with mixing raw and non-raw strings: r'a\b' 'm\n' r'x\y' is the same as if you wrote: 'a\\bm\nx\\y' or r'''a\bm x\y''' In the case of *implicit* concatenation, I think that the concatenations
Doing the concatenation at compile time does NOT require the "infected" behavior you describe below as noted above.
(Implicit concatenation is a compile-time operation, the format(...) stuff is run-time, so there is a clear and logical order of operations.)
To you, maybe. To the average developer, I doubt it. I view the compile time evaluation of implicit concatenation as a compiler implementation detail as it makes essentially no difference to the semantics of the program. (Yes, I know that runtime concatenation *might* produce a different string object each time through the code but it doesn't have to. I hope you don't write programs that depend on the presence or absence of string pooling.)
Just as it's safe to concat strings after escaping the non-format ones. There is one additional detail. I think it should be required that each format string stand on its own. That is: f'x{foo' f'bar}y' should be an error and not the equivalent of f'x{foobar}y' --- Bruce Check out my new puzzle book: http://J.mp/ingToConclusions <http://j.mp/ingToConclusions> Get it free here: http://J.mp/ingToConclusionsFree <http://j.mp/ingToConclusionsFree> (available on iOS)
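The escape-then-concatenate idea from this message can be approximated with str.format. This is a sketch under assumptions: escape_braces is a hypothetical helper standing in for the COMPILE_TIME_ESCAPE step, and the values passed to format are made up.

```python
def escape_braces(text):
    """Double every brace so str.format treats it as literal text."""
    return text.replace('{', '{{').replace('}', '}}')

# Stand-in for f'{foo}' '{bar}' f'{oof}': only the first and last
# fragments are meant to be interpolated.
template = '{foo}' + escape_braces('{bar}') + '{oof}'
result = template.format(foo=1, oof=3)

print(template)  # {foo}{{bar}}{oof}
print(result)    # 1{bar}3
```

Escaping the plain fragment before joining keeps its braces literal, so the concatenation can happen early without changing what the reader sees in the output.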
On Wed, Jul 22, 2015 at 09:28:19PM -0700, Bruce Leban wrote:
If I had a dollar for every time somebody on the Internet misused "strawman argument", I would be a rich man. Just because you disagree with me or think I'm wrong doesn't make my argument a strawman. It just makes me wrong-headed, or wrong :-) I'm having trouble understanding what precisely you are disagreeing with. The example I give which you quote involves explicit concatenation with the + operator, but your examples below use implicit concatenation with no operator at all. Putting aside the question of implementation, I think: (1) Explicit concatenation with the + operator should be treated as occurring after the f strings are evaluated, *as if* the following occurs: f'{spam}' + '{eggs}' => compiles to format(spam) + '{eggs}' If you can come up with a clever optimization that avoids the need to *actually* build two temporary strings and then concatenate them, I don't have a problem with that. I'm only talking about the semantics. I don't want this: f'{spam}' + '{eggs}' => compiles to format(spam) + format(eggs) # not this! Do you agree with those semantics for explicit + concatenation? If not, what behaviour do you want? (2) Implicit concatenation should occur as early as possible, before the format. Take the easy case first: both fragments are f-strings. f'{spam}' f'{eggs}' => behaves as if you wrote f'{spam}{eggs}' => which compiles to format(spam) + format(eggs) Do you agree with those semantics for implicit concatenation? (3) The hard case, when you mix f and non-f strings. f'{spam}' '{eggs}' Notwithstanding raw strings, the behaviour which makes sense to me is that the implicit string concatenation occurs first, followed by format. So, semantically, if the parser sees the above, it should concat the string: => f'{spam}{eggs}' then transform it to a call to format: => format(spam) + format(eggs) I described that as the f "infecting" the other string. Guido has said he doesn't like this, but I'm not sure what behaviour he wants instead. 
I don't think I want this behaviour: f'{spam}' '{eggs}' => format(spam) + '{eggs}' for two reasons. Firstly, I already have (at least!) one way of getting that behaviour, such as explicit + concatenation as above. Secondly, it feels that this does the concatenation in the wrong order. Implicit concatenation occurs as early as possible in every other case. But here, we're delaying the concatenation until after the format. So this feels wrong to me. (Again, I'm talking semantics, not implementation. Clever tricks with escaping the brackets don't matter.) If there's no consensus on the behaviour of mixed f and non-f strings with implicit concatenation, rather than pick one and frustrate and surprise half the users, we should make it an error: f'{spam}' '{eggs}' => raises SyntaxError and require people to be explicit about what they want, e.g.: f'{spam}' + '{eggs}' # concatenation occurs after the format() f'{spam}' f'{eggs}' # implicit concatenation before format() (for the avoidance of doubt, I don't care whether the concatenation *actually* occurs after the format, I'm only talking about semantics, not implementation, sorry to keep beating this dead horse).
I don't think we can look at strings in isolation line-by-line. s = r'''This is a long \raw s\tring that goes over mul\tiple lines and contains "\backslashes" okay? '''
I'm not sure if you are complimenting me on being a genius, or putting the average developer down for being even more dimwitted than me :-)
But once you bring f strings into the picture, then it DOES make a very large semantic difference. f'{spam}' '{eggs}' is very different depending on whether that is semantically the same as: - concat '{spam}' and '{eggs}', then format - format spam alone, then concat '{eggs}' We can't just say that when the concatenation actually occurs is an optimization, as we can with raw and cooked string literals, because the f string is not a literal, it's actually a function call in disguise. So we have to pick one or the other (or refuse to guess and raise a syntax error). You're right that it doesn't have to occur at compile time. (Although that has been the case all the way back to at least Python 1.5.) But it is a syntactic feature: "Note that this feature is defined at the syntactical level, but implemented at compile time. The ‘+’ operator must be used to concatenate string expressions at run time." https://docs.python.org/3/reference/lexical_analysis.html#string-literal-con... which suggests to me that *semantically* it should occur as early as possible, before the format() operation. That is, it should be equivalent to: - concat '{spam}' and '{eggs}', then format and not format followed by concat. You mentioned the principle of least surprise. I think it would be very surprising to have implicit concatenation behave *as if* it were occurring after the format, which is what you get if you escape the {{eggs}}. But YMMV. If we (the community) cannot reach consensus, perhaps the safest thing would be to just refuse to guess and raise an error on implicit concat of f and non-f strings. -- Steve
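The two readings contrasted above give visibly different results. The sketch below emulates both with str.format, so it does not depend on any f-string implementation; the values are made up.

```python
values = {'spam': 'ham', 'eggs': 'toast'}  # made-up values

# Reading 1: concatenate first, then format the combined string
# (the "infecting" behaviour).
concat_then_format = ('{spam}' + '{eggs}').format(**values)

# Reading 2: format the first fragment alone, then concatenate the
# second fragment untouched.
format_then_concat = '{spam}'.format(**values) + '{eggs}'

print(concat_then_format)  # hamtoast
print(format_then_concat)  # ham{eggs}
```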
On 2015-07-23 15:22, Steven D'Aprano wrote:
To me, implicit concatenation is just concatenation that binds more tightly, so: 'a' 'b' is: ('a' + 'b') It can be optimised to 'ab' at compile-time.
To me: f'{spam}' '{eggs}' is: (f'{spam}' + '{eggs}') just as: r'\a' '\\' is: (r'\a' + '\\') [snip]
On 07/23/2015 10:22 AM, Steven D'Aprano wrote:
I think this should be... => f'{spam}{{eggs}}' The advantage that has is you could call its format method manually again to set the eggs name in a different context. It would also work as expected in the case the second string is an f-string. '{spam}' f'{eggs}' f'{{spam}}{eggs}' So if any part of an implicitly concatenated string is an f-string, then the whole becomes an f-string, and the parts that were not have their braces escaped. The part that bothers me is it seems like the "f" should be a unary operator rather than a string prefix. As a prefix: s = f'{spam}{{eggs}}' # spam s2 = s.format(eggs=eggs) # eggs As an unary operator: s = ? '{spam}{{eggs}}' # spam s2 = ? s # eggs (? == some to be determined symbol) They are just normal strings in the second case. Cheers, Ron
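The two-stage formatting described in this message already works with plain str.format, which may make the idea easier to follow. A small sketch with made-up values:

```python
spam, eggs = 'ham', 'toast'  # made-up values

# First pass fills in spam; the doubled braces collapse to {eggs}.
s = '{spam}{{eggs}}'.format(spam=spam)

# Second pass, possibly in a different context, fills in eggs.
s2 = s.format(eggs=eggs)

print(s)   # ham{eggs}
print(s2)  # hamtoast
```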
On Fri, Jul 24, 2015 at 3:52 AM, Ron Adam <ron3200@gmail.com> wrote:
Except that they can't be normal strings, because the compiler has to parse them. They're expressions. You can't take input from a user and f-string it (short of using exec/eval, of course); it has to be there in the source code. ChrisA
On 7/23/2015 1:57 PM, Chris Angelico wrote:
Right. This is the "unevaluated f-string" of which I spoke in https://mail.python.org/pipermail/python-ideas/2015-July/034728.html It would be a huge code injection opportunity, and I agree it's best we don't implement it. Eric.
Greg Ewing <greg.ewing@canterbury.ac.nz> writes:
The existing behaviour of implicit concatenation doesn't give much of a guide here, unfortunately:: >>> 'foo\abar' r'lorem\tipsum' 'wibble\bwobble' 'foo\x07barlorem\\tipsumwibble\x08wobble' >>> type(b'abc' 'def' b'ghi') File "<stdin>", line 1 SyntaxError: cannot mix bytes and nonbytes literals So, the ‘b’ prefix expects to apply to all the implicitly-concatenated parts (and fails if they're not all bytes strings); the ‘r’ prefix expects to apply only to the one fragment, leaving others alone. Is the proposed ‘f’ prefix, on a fragment in implicit concatenation, meant to have behaviour analogous to the ‘r’ prefix or the ‘b’ prefix, or something else? What's the argument in favour of that choice? -- \ “If we ruin the Earth, there is no place else to go. This is | `\ not a disposable world, and we are not yet able to re-engineer | _o__) other planets.” —Carl Sagan, _Cosmos_, 1980 | Ben Finney
On Fri, Jul 24, 2015 at 10:27 PM, Ben Finney <ben+python@benfinney.id.au> wrote:
It *must* work like r'' does. Implicit concatenation must be thought of as letting each string do its thing and then concatenating using '+', just optimized if possible. The error for b'' comes out because the '+' refuses b'' + ''. I find it a sign of the times that even this simple argument goes on and on forever. Please stop the thread until Eric has had the time to write up a PEP. -- --Guido van Rossum (python.org/~guido)
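The "each fragment does its own thing, then '+'" rule can be checked for the existing r and b prefixes. The last part of this sketch assumes an interpreter that has f-strings (Python 3.6 or later), where the f prefix follows the same per-fragment rule; the variable name spam is made up.

```python
# r'' applies to its own fragment only; the neighbour keeps normal escapes.
mixed_raw = r'a\b' 'm\n'
assert mixed_raw == r'a\b' + 'm\n' == 'a\\bm\n'

# b'' next to a str literal is rejected at compile time, so the
# snippet is compiled from a string here to catch the error.
try:
    compile("b'abc' 'def'", '<example>', 'eval')
    bytes_mix_allowed = True
except SyntaxError:
    bytes_mix_allowed = False

# Assumes Python 3.6+: the f prefix likewise affects only its own fragment.
spam = 'ham'
mixed_f = f'{spam}' '{eggs}'

print(bytes_mix_allowed)  # False
print(mixed_f)            # ham{eggs}
```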
Guido van Rossum <guido@python.org> writes:
That makes sense, and is nicely consistent (‘f’, ‘r’, and ‘b’ all apply only to the one fragment, and then concatenation rules apply). Thanks.
I found this discussion helpful in knowing the intent, and what people's existing expectations are. Hopefully you found it helpful too, Eric! In either case, I look forward to your PEP. -- \ “… one of the main causes of the fall of the Roman Empire was | `\ that, lacking zero, they had no way to indicate successful | _o__) termination of their C programs.” —Robert Firth | Ben Finney
On 7/25/2015 3:55 AM, Ben Finney wrote:
Yes, I think that's the only interpretation that makes sense.
In trying to understand the issues for a PEP, I'm working on a sample implementation. There, I've just disallowed concatenation entirely. Compared to all of the other issues, it's really insignificant. I'll put it back at some point. Eric.
On 7/25/2015 3:55 PM, Eric V. Smith wrote:
I'm basically done with my implementation of f-strings. I really can't decide if I want to allow adjacent f-string concatenation or not. I'm leaning towards not. I don't like mixing compile-time concatenation with run-time expression evaluation. But my mind is not made up. One issue that has cropped up: Should we support !s and !r, like str.format does? It's not really needed, since with f-strings you can just call str or repr yourself:
Do we also need to support:
f'{"foo"!r}' "'foo'"
With str.format, !s and !r are needed because you can't put the call to repr in str.format's very limited expression syntax. But since f-strings support arbitrary expressions, it's not needed. Still, I'm leaning toward including it for two reasons: it's concise, and there's no reason to be arbitrarily incompatible with str.format. If I include !s and !r, then the only way that str.format differs from f-string expressions is in non-numeric subscripting (unfortunate, but discussed previously and I think required). This ignores the fact that f-string expressions encompass all Python expressions, while str.format is extremely limited. I'll start working on the PEP shortly. Eric.
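A short sketch of the point about !r: str.format needs the conversion flag, while an expression-based f-string could use either spelling. The f-string lines assume Python 3.6 or later, where the syntax exists; value is a made-up name.

```python
value = 'foo'

# str.format needs the conversion flag, since it cannot call repr itself.
with_format = '{!r}'.format(value)

# Assumes Python 3.6+: in an f-string both spellings are available.
with_call = f'{repr(value)}'
with_flag = f'{value!r}'

print(with_flag)  # 'foo'
```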
+1 for !r/!s and not being arbitrarily incompatible with existing formatting. (I also really like being able to align string literals using an f-string. That seems to come up all the time in my shorter scripts for headings etc.) Cheers, Steve
On Sat, Aug 01, 2015 at 01:43:49PM -0400, Eric V. Smith wrote:
There's no harm in allowing implicit concatenation between f-strings. Possible confusion only creeps in when you allow implicit concatenation between f- and non-f-strings.
Wait, did I miss something? Does this mean that f-strings will essentially be syntactic sugar for str(eval(s))? f"[i**2 for i in sequence]" f = lambda s: str(eval(s)) f("[i**2 for i in sequence]") -- Steve
On 8/1/2015 2:25 PM, Steven D'Aprano wrote:
On Sat, Aug 01, 2015 at 01:43:49PM -0400, Eric V. Smith wrote:
Well, it's somewhat more complex. It's true that:
But it's more complex when there are format specifiers and literals involved. Basically, the idea is that: f'a{expr1:spec1}b{expr2:spec2}c' is shorthand for: ''.join(['a', expr1.__format__(spec1), 'b', expr2.__format__(spec2), 'c']) The expressions can indeed be arbitrarily complex expressions. Because only string literals are supported, it's just the same as if you'd written the expressions not inside of a string (as shown above). It's not like you're eval-ing user supplied strings. Eric.
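The expansion sketched above can be tried with concrete values. This example uses the built-in format(value, spec) in place of calling __format__ directly, and the f-string line assumes Python 3.6 or later; price and count are made-up names.

```python
price, count = 3.14159, 42  # made-up values

# Hand-written expansion: literal pieces interleaved with formatted values.
expanded = ''.join(['a', format(price, '.2f'), 'b', format(count, '>5'), 'c'])

# Assumes Python 3.6+ for the f-string itself.
direct = f'a{price:.2f}b{count:>5}c'

# Both spellings should produce identical text.
assert expanded == direct
print(expanded)
```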
On 8/2/2015 7:46 PM, Mike Miller wrote:
Hi,
I don't understand how we got to arbitrary expressions.
I think here: https://mail.python.org/pipermail/python-ideas/2015-July/034701.html
There was probably an edge case or two, but I wasn't expecting str(eval(s)) to be the answer, and one I'm not sure I'd want.
As I pointed out earlier, it's not exactly str(eval(s)). Also, what's your concern with the suggested approach? There are no security concerns as there would be with eval-ing arbitrary strings. Eric.
On 08/02/2015 07:43 PM, Eric V. Smith wrote:
In that message, GvR seems to be exploring the options. I could be wrong, but from reading again, he appears to favor keeping it to .format() syntax?
There was probably an edge case or two, but I wasn't expecting
Did anyone discover the strategy below wasn't possible (moving the identifiers)? f'{x}{y}' --> '{}{}'.format(x, y) f'{x:10}{y[name]}' --> '{:10}{[name]}'.format(x, y)
Also, what's your concern with the suggested approach? There are no security concerns as there would be with eval-ing arbitrary strings.
That's true I suppose, and perhaps I'm being irrational but it feels like a complex solution, a pandora's box if you will. To put it another way, it's way more power than I was expecting. It's rare that I use even the advanced features of .format() as it is. I'm guessing that despite the magic happening behind the scenes, people will still think of the format string as an (interpolated) string, like the shell. If they want to write arbitrary expressions they can already do that in python and then format a string with the answers. This will be another way to write code, that's (as far as I know) not strictly necessary. Also I thought, that the simpler the concept, the greater likelihood of PEP acceptance. Binding the format string to .format() does that, in the mind at least, if not the implementation. Still, if this is what most people want, I'll keep quiet from now on. ;) (Thanks for taking this on, btw.) -Mike
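The "move the identifiers" rewrite asked about in this message can be checked against str.format as it exists today. A sketch with made-up data for x and y:

```python
x = 5
y = {'name': 'spam'}  # made-up data

# Proposed rewrite of f'{x:10}{y[name]}': identifiers move into the
# argument list, everything else stays in the format string.
rewritten = '{:10}{[name]}'.format(x, y)

# The same output written with named fields, for comparison.
named = '{x:10}{y[name]}'.format(x=x, y=y)

# An int is right-aligned in a width-10 field, then 'spam' follows.
assert rewritten == named == ' ' * 9 + '5spam'
print(repr(rewritten))
```

This only covers what str.format already accepts after the identifier (subscripts, attributes, format specs); it says nothing about arbitrary expressions.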
On Sun, Aug 02, 2015 at 10:43:03PM -0400, Eric V. Smith wrote:
Language features should be no more powerful than they need to be. It isn't just *security* that we should be concerned about, it's also about readability, learnability, the likelihood of abuse by writing unmaintainable Perlish one-liners, and the general increase in complexity. Or to put it another way... YAGNI. We started off with a fairly simple and straightforward feature request: to make it easy to substitute named variables in format strings. We ought to be somewhat cautious about accepting even that limited version. After all, hundreds of languages don't have such a feature, and Python worked perfectly well without it for over 20 years. This doesn't add anything to the language that cannot already be done with % and str.format(). But suddenly we've gone from a feature request that has been routinely denied many times in the past (having variables be automatically substituted into strings), to full-blown evaluation of arbitrarily complex expressions being discussed as if it were a done-deal. I've heard of the trick of asking for a pony if you actually want a puppy, but this is the first time I've seen somebody ask for a puppy and be given a thoroughbred. Anyway, there's no harm done, since this is going through the PEP process. It just strikes me as so unlike the usual conservatism, particularly when it comes to syntax changes, that it surprised me. Perhaps somebody slipped something in the water? :-) -- Steve
data:image/s3,"s3://crabby-images/ab219/ab219a9dcbff4c1338dfcbae47d5f10dda22e85d" alt=""
On 08/02/2015 10:43 PM, Eric V. Smith wrote:
Actually, a better link is: https://mail.python.org/pipermail/python-ideas/2015-July/034729.html where I discuss the pros and cons of str.format-like expressions, versus full expressions. Plus, Guido's response. I hope to have the first draft of a PEP ready in the next few days. I'll also look at putting my implementation online somewhere. Eric.
data:image/s3,"s3://crabby-images/29b39/29b3942a63eb62ccdbf1017071ca08bf05e5ca70" alt=""
Hi, In that message there was a logical step that I don't follow:
There was a solution to this that came up early in the discussion, moving the identifier only:

    f'{x}{y}'            -->  '{}{}'.format(x, y)
    f'{x:10}{y[name]}'   -->  '{:10}{[name]}'.format(x, y)

I missed the part where this was rejected. As far as I can tell from your message, it is because it would be hard to parse? But, it seems no harder than other solutions. I've whipped up a simple implementation below.

Also, Guido sounds supportive of your general process, but to my knowledge has not explicitly called for arbitrary expressions to be included. Perhaps he could do that, or encourage us to find a more conservative solution?

Sorry to be a pain, but I think this part is important to get right.

-Mike

Simple script to illustrate (just ascii, only one format op supported). TL;DR: the idea is to grab the identifier portion by examining the class of each character, then move it over to a .format function call argument.

    import string

    idchars = string.ascii_letters + string.digits + '_'  # + unicode letters
    capture = None
    isid = None
    fstring = '{a[2:3]:20d}'
    #~ fstring = '{a.foo}'
    identifier = []
    fmt_spec = []

    for char in fstring:
        print(char + ', ', end='')
        if char == '{':
            print('start_capture ', end='')
            capture = True
            isid = True
        elif char == '}':
            print('end_capture')
            capture = False
            break
        else:
            if capture:
                if (char in idchars) and isid:
                    identifier.append(char)
                else:
                    isid = False
                    fmt_spec.append(char)

    identifier = ''.join(identifier)
    fmt_spec = ''.join(fmt_spec)
    print()
    print('identifier:', repr(identifier))
    print('fmt_spec:  ', repr(fmt_spec))
    print('result:    ', "'{%s}'.format(%s)" % (fmt_spec, identifier))

And the results:

    >python3 fstr.py
    {, start_capture a, [, 2, :, 3, ], :, 2, 0, d, }, end_capture

    identifier: 'a'
    fmt_spec:   '[2:3]:20d'
    result:     '{[2:3]:20d}'.format(a)

On 08/04/2015 11:32 AM, Eric V. Smith wrote:
data:image/s3,"s3://crabby-images/ab219/ab219a9dcbff4c1338dfcbae47d5f10dda22e85d" alt=""
On 8/4/2015 4:05 PM, Mike Miller wrote:
It's rejected because .format treats: '{:10}{[name]}'.format(x, y) --> format(x, '10') + format(y['name']) and we (for some definition of "we") would like: f'{x:10}{y[name]}' --> format(x, '10') + format(y[name]) It's the change from y[name] to y['name'] that Guido rejected for f-strings. And I agree: it's unfortunate that str.format works this way. It would have been better just to say that the subscripted value must be a literal number for str.format, but it's too late for that. It's not hard to parse either way. All of the machinery exists to use either the str.format approach, or the full expression approach.
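To make the difference concrete, here is a minimal sketch that runs on a Python with f-strings (3.6 or later). The variable names and dictionary contents are made up for illustration; only the two lookup rules come from the message above.

```python
# Hypothetical data: one key is the literal text 'name', the other is the
# value held by the variable called name.
name = 'key_from_variable'
y = {'name': 'looked up by literal', 'key_from_variable': 'looked up by variable'}
x = 42

# str.format: the text inside [...] is taken as the string 'name'.
via_format = '{:10}{[name]}'.format(x, y)

# f-string: y[name] is an ordinary expression, so the variable is evaluated.
via_fstring = f'{x:10}{y[name]}'

print(repr(via_format))
print(repr(via_fstring))
```

The two results differ only in which dictionary entry is selected, which is exactly the y[name] versus y['name'] change described above.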
True, he hasn't definitively stated his approval for arbitrary expressions. I think it logically follows from our discussions. But if he'd like to rule on it one way or the other before I'm done with the PEP draft, that's fine with me. Or, we can just wait for the PEP. Personally, now that I have a working implementation that I've been using, I have to say that full expressions are pretty handy. And while I agree you don't want to be putting hyper-complicated dict comprehensions with lots of function calls into an f-string, the same can be said of many places we allow expressions.
Sorry to be a pain, but I think this part is important to get right.
No problem. It's all part of the discussion. Eric.
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
OK, fine, I'll say right now that I agree with Eric's arguments for full expressions. (Though honestly the whole look of f-strings hasn't quite grown on me. I wish I could turn back the clock and make expression substitution a feature of all string literals, perhaps using \{...}, which IIRC I've seen in some other language.) On Tue, Aug 4, 2015 at 10:20 PM, Eric V. Smith <eric@trueblade.com> wrote:
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/ab219/ab219a9dcbff4c1338dfcbae47d5f10dda22e85d" alt=""
On 8/4/2015 4:52 PM, Guido van Rossum wrote:
OK, fine, I'll say right now that I agree with Eric's arguments for full expressions.
Thanks.
Well, we could do that with a future statement. It might be tough to ever make it the default, though. But since it would only be literals, it's easy enough to find. Eric.
data:image/s3,"s3://crabby-images/29b39/29b3942a63eb62ccdbf1017071ca08bf05e5ca70" alt=""
On 08/04/2015 01:20 PM, Eric V. Smith wrote:
It's rejected because .format treats: '{:10}{[name]}'.format(x, y) --> format(x, '10') + format(y['name'])
Isn't this what already happens? Seems odd to go in a different direction just to avoid an implementation that already exists, even though it may not be perfect. Perhaps it's time to deprecate the troublesome syntax? Fortunately there's plenty of time before the next version of python to figure this out. -Mike
data:image/s3,"s3://crabby-images/29b39/29b3942a63eb62ccdbf1017071ca08bf05e5ca70" alt=""
Sorry to reply to myself... I'm hoping we could consider a .format()-only implementation as Plan B, alongside your Plan A with arbitrary expressions. -Mike
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
I can only promise that will be considered if you write up a full proposal and specification, to compete with Eric's PEP. (I won't go as far as requiring you to provide an implementation, like Eric.) On Wed, Aug 5, 2015 at 1:29 AM, Mike Miller <python-ideas@mgmiller.net> wrote:
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/ef1c2/ef1c2b0cd950cc4cbc0d26a5e2b8ae2dd6375afc" alt=""
On 08/04/2015 07:05 PM, Mike Miller wrote:
Since "f" strings don't exist yet, they could be handled with a different method. '{x:10}{y[name]}'.__fmt__(x=x, y=y, name=name) The string isn't altered here, which may help with error messages, and all names are supplied as keywords explicitly. But is there a migration path that would work? Cheers, Ron
data:image/s3,"s3://crabby-images/6a9ad/6a9ad89a7f4504fbd33d703f493bf92e3c0cc9a9" alt=""
On Sun, Aug 02, 2015 at 10:43:03PM -0400, Eric V. Smith wrote:
This comment has been sitting at the back of my mind for days, and I suddenly realised why. That's not correct, there are security concerns. They're not entirely new concerns, but the new syntax makes it easier to fall into the security hole. Here's an example of shell injection in PHP:

    <?php
    print("Please specify the name of the file to delete");
    print("<p>");
    $file=$_GET['filename'];
    system("rm $file");
    ?>

https://www.owasp.org/index.php/Command_Injection

With the new syntax, Python's example will be:

    os.system(f"rm {file}")

or even

    os.system("rm \{file}")

if Eric's second proposal goes ahead. Similarly for SQL injection and other command injection attacks.

It is true that the same issues can occur today, for example:

    os.system("rm %s" % file)

but it's easier to see the possibility of an injection with an explicit interpolation operator than the proposed implicit one. We can teach people to avoid the risk of command injection attacks by avoiding interpolation, but the proposed syntax makes it easier to use interpolation without noticing. Especially with the proposed \{} syntax, any string literal could do runtime interpolation, and the only way to know whether it does or not is to inspect the entire string carefully. Passing a literal is no longer safe, as string literals will no longer just be literals, they will be runtime expressions.

Bottom line: the new syntax will make it easier for command injection to remain unnoticed. Convenience cuts both ways. Making the use of string interpolation easier also makes the *misuse* of string interpolation easier.

-- Steve
data:image/s3,"s3://crabby-images/efe4b/efe4bed0c2a0c378057d3a32de1b9bcc193bea5e" alt=""
On 08/06/2015 05:18 AM, Steven D'Aprano wrote:
Is it? Why? To me, the problem of injection is completely orthogonal to how exactly the string interpolation is performed. Also, there's nothing "implicit" about the new syntax. It does not magically interpolate where it feels like, or coerce objects to strings. It interpolates wherever you - explicitly - put the new syntax. cheers, Georg
data:image/s3,"s3://crabby-images/eac55/eac5591fe952105aa6b0a522d87a8e612b813b5f" alt=""
On 6 August 2015 at 13:18, Steven D'Aprano <steve@pearwood.info> wrote:
We actually aim to teach folks to avoid shell injection attacks by avoiding the shell: https://docs.python.org/3/library/subprocess.html#security-considerations If you invoke the shell in any kind of networked application, it's inevitable that you're eventually going to let a shell injection attack through (at which point you better hope you have something like SELinux or AppArmor configured to protect your system from your mistake). That said, this is also why I'm a fan of eventually allowing syntax like: !sh("sort $file > uniq > wc -l") !sql("select $col from $table") !html("<html><body>$body</body></html>") that eventually adapts whatever interpolation syntax we decide on here for format strings to other operations like shell commands and SQL queries. The more time I spend dealing with the practical realities of writing commercial software, the more convinced I become that the right way to do something and the easiest way to do something have to be the same way if we seriously expect people to consistently get it right (and yes, the PEP 466 & 476 discussions had a significant role to play in that change of heart, as did the Unicode changes between Python 2 & 3). When the current easiest way is wrong, the only way to reliably get people to do it right in the future is to provide an even easier way that automatically does the right thing by default (this also helps act as a forcing function that encourages folks to learn "how to do it right" in older versions, even if the new feature itself isn't available there). It's not a panacea (bad habits are hard to unlearn), but we can at least try to help stop particularly pernicious problems getting worse. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
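As a rough illustration of "avoiding the shell", the sketch below passes a hostile-looking file name to a child process as a single list element, so no shell ever parses it. The file name and the tiny child program are invented for the example; it assumes Python 3.7 or later for capture_output.

```python
import subprocess
import sys

# Hypothetical hostile input: a shell would treat ';' as a command separator.
hostile = "report.txt; echo INJECTED"

# The risky pattern builds one string for a shell to parse (not executed here).
risky_command = f"rm {hostile}"

# Safer pattern: an argument list and no shell. The child simply echoes the
# single argument it received, which shows the input arrives unsplit.
child_code = "import sys; print(sys.argv[1])"
completed = subprocess.run(
    [sys.executable, "-c", child_code, hostile],
    capture_output=True,
    text=True,
    check=True,
)
received = completed.stdout.strip()

print(risky_command)
print(received)
```

The interpolation syntax is the same in both cases; what changes the risk is whether the resulting string is handed to a shell.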
data:image/s3,"s3://crabby-images/eac55/eac5591fe952105aa6b0a522d87a8e612b813b5f" alt=""
On 7 August 2015 at 00:15, Ron Adam <ron3200@gmail.com> wrote:
It's already there in my view: $ python -m this | grep 'obvious way' There should be one-- and preferably only one --obvious way to do it. When a particular approach is both easy and right, it rapidly becomes the obvious choice. Issues arise when the right way is harder than the wrong way, since the apparently obvious way is a bad idea, but the superior alternative isn't as clearly applicable to the problem. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
data:image/s3,"s3://crabby-images/ab219/ab219a9dcbff4c1338dfcbae47d5f10dda22e85d" alt=""
On 8/1/2015 1:43 PM, Eric V. Smith wrote:
Here's another issue. I can't imagine this will happen often, but it should be addressed. It has to do with literal expressions that begin with a left brace. For example, this expression:
>>> {x: y for x, y in [(1, 2), (3, 4)]}
{1: 2, 3: 4}
If you want to put it in an f-string, you'd naively write:
>>> f'expr={{x: y for x, y in [(1, 2), (3, 4)]}}'
'expr={x: y for x, y in [(1, 2), (3, 4)]}'
But as you see, this won't work because the doubled '{' and '}' chars are just interpreted as escaped braces, and the result is an uninterpreted string literal, with the doubled braces replaced by undoubled ones. There's currently no way around this. You could try putting a space between the left braces, but that fails with IndentationError:
In the PEP I'm going to specify that leading spaces are skipped in an expression. So that last example will now work:
>>> f'expr={ {x: y for x, y in [(1, 2), (3, 4)]}}'
'expr={1: 2, 3: 4}'
Note that the right braces in that last example aren't interpreted as a doubled '}'. That's because the first one is part of the expression, and the second one ends the expression. The only time doubling braces matters is inside the string literal portion of an f-string. I'll reflect this "skip leading white space" decision in the PEP. Eric.
data:image/s3,"s3://crabby-images/ab219/ab219a9dcbff4c1338dfcbae47d5f10dda22e85d" alt=""
On 8/2/2015 11:57 AM, MRAB wrote:
Good question. I'm parsing it with PyParser_ASTFromString. Maybe I'm missing a compiler flag there which will ignore leading spaces. But in any event, the result is the same: You'll need to add a space here in order to disambiguate it from doubled braces. That's really the crux of the issue. Eric.
data:image/s3,"s3://crabby-images/4c94f/4c94fef82b11b5a49dabd4c0228ddf483e1fc69f" alt=""
On 02/08/2015 20:12, Xavier Combelle wrote:
You could disambiguate with parenthesis like this f'expr={({x: y for x, y in [(1, 2), (3, 4)]})}'
What on earth happened to "Readability counts"? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
data:image/s3,"s3://crabby-images/ef1c2/ef1c2b0cd950cc4cbc0d26a5e2b8ae2dd6375afc" alt=""
On 08/02/2015 11:37 AM, Eric V. Smith wrote:
On 8/1/2015 1:43 PM, Eric V. Smith wrote:
On 7/25/2015 3:55 PM, Eric V. Smith wrote:
This probably doesn't work either... f'expr={{{x: y for x, y in [(1, 2), (3, 4)]}}}' Escaping "{{{" needs to resolve from left to right to work. Which is weird.
Could two new escape characters be added to python strings? "\{" and "\}" f'expr={\{x: y for x, y in [(1, 2), (3, 4)]\}}' Ron
data:image/s3,"s3://crabby-images/0f8ec/0f8eca326d99e0699073a022a66a77b162e23683" alt=""
On Mon, Aug 3, 2015 at 1:37 AM, Eric V. Smith <eric@trueblade.com> wrote:
Sounds good. And even though your }} is perfectly valid, I'd recommend people use spaces at both ends:
>>> f'expr={ {x: y for x, y in [(1, 2), (3, 4)]} }'
'expr={1: 2, 3: 4}'
which presumably would be valid too. It's a narrow enough case (expressions beginning or ending with a brace) that the extra spaces won't be a big deal IMO. ChrisA
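The brace cases discussed in this sub-thread can be checked on any Python that has f-strings (3.6 or later). This is only a small sketch of the three spellings mentioned: doubled braces, spaces at both ends, and the parenthesised form.

```python
# Doubled braces are escapes, so nothing is evaluated.
literal = f'expr={{x: y for x, y in [(1, 2), (3, 4)]}}'

# A space after the opening brace (and before the closing one) keeps the
# dict comprehension from being read as an escaped brace.
spaced = f'expr={ {x: y for x, y in [(1, 2), (3, 4)]} }'

# Parentheses disambiguate in the same way.
parenthesised = f'expr={({x: y for x, y in [(1, 2), (3, 4)]})}'

print(literal)
print(spaced)
print(parenthesised)
```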
data:image/s3,"s3://crabby-images/6a9ad/6a9ad89a7f4504fbd33d703f493bf92e3c0cc9a9" alt=""
On Fri, Jul 24, 2015 at 11:40:49AM +1200, Greg Ewing wrote:
As I stated before, I think that should at least raise a warning, if not a syntax error. I think we're in uncharted territory here, because f strings aren't really a literal string, they're actually a runtime function call, and we haven't got much in the way of prior-art for implicit concatenation of a literal with a function call. So we ought to be cautious when dealing with anything the least bit ambiguous, and avoid baking in a mistake that we can't easily change. There's no ambiguity with concat'ing f strings only, or non-f strings only, but the more we discuss this, the more inclined I am to say that implicit concatenation between f strings and non-f strings *in any order* should be a syntax error.
It would seem very strange to me if the f infected strings *before* it as well as after it.
It's consistent with Python 2:

    py> "abc" u"ßŮƕΩж" "def"
    u'abc\xdf\u016e\u0195\u03a9\u0436def'

The unicodeness of the middle term turns the entire concatenation into unicode. I think it is informative that Python 3 no longer allows this behaviour:

    py> b"abc" u"ßŮƕΩж" b"def"
      File "<stdin>", line 1
    SyntaxError: cannot mix bytes and nonbytes literals

-- Steve
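The Python 3 rule quoted above can be checked without a REPL by asking compile() whether a source fragment is accepted. The helper name compiles is made up for this sketch.

```python
def compiles(source):
    """Return True if source parses as a Python 3 expression."""
    try:
        compile(source, '<example>', 'eval')
    except SyntaxError:
        return False
    return True

# Same-kind literals concatenate implicitly.
str_only = compiles('"abc" "def"')
bytes_only = compiles('b"abc" b"def"')

# Mixing bytes and str literals is rejected at compile time.
mixed = compiles('b"abc" "def"')

print(str_only, bytes_only, mixed)
```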
data:image/s3,"s3://crabby-images/4d484/4d484377daa18e9172106d4beee4707c95dab2b3" alt=""
On Thu, Jul 23, 2015 at 7:22 AM, Steven D'Aprano <steve@pearwood.info> wrote:
If I had a dollar for everytime somebody on the Internet misused "strawman argument", I would be a rich man.
You wouldn't get a dollar here. If you want to be strict, a strawman argument is misrepresenting an opponent's viewpoint to make it easier to refute but it also applies to similar arguments. You stated that "constant folding ... *would* change the semantics" *[emphasis added]*. It's not a fact that constant folding must change the semantics as is easily shown. And in fact, by definition, constant folding should never change semantics. So the straw here is imagining that the implementer of this feature would ignore the accepted rules regarding constant folding and then criticizing the implementer for doing that.
I agree with that.
Yes (3) The hard case, when you mix f and non-f strings.
You talk about which happens "first" so let's recast this as an operator precedence question. Think of f as a unary operator. Does f bind tighter than implicit concatenation? Well, all other string operators like this bind more tightly than concatenation. f'{spam}' '{eggs}'
Implicit concatenation does NOT happen as early as possible in every case. When I write: r'a\n' 'b\n' ==> 'a\\nb\n' the r is applied to the first string *before* the concatenation with the second string.
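A quick check of the raw-string example, as a sketch: the r prefix affects only the literal it is attached to, and the neighbouring literal keeps its normal escapes.

```python
# The first literal is raw (backslash and 'n' stay as two characters);
# the second is an ordinary literal ending in a real newline.
mixed = r'a\n' 'b\n'

print(repr(mixed))
print(len(mixed))  # 'a', '\\', 'n', 'b', '\n'
```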
Imagine that we have another prefix that escapes strings for regex. That is e'a+b' ==> 'a\\+b'. This is another function call in disguise, just calling re.escape. Applying your reasoning could have us conclude that e is just like f and should infect all the other strings it is concatenated with. But that would actually break the reason to have this in the first place, writing strings like this: '(' e'1+2' '|' e'1*2' '){1,2}' Perhaps you're thinking that e should be done at compile time. Well, when I combine it with f, it clearly must be done at run-time: '(' ef'{foo}' '|' ef'{bar}' '){1,2}' I'm not actually proposing an e prefix. I'm just speculating how it would work if we had one. And combining e and f must mean do f then e because the other order is useless, just as combining f and r must mean do r then f. Maybe you can't say that concatenation is an optimization but I can (new text underlined): Multiple adjacent string or bytes literals (delimited by whitespace), possibly using different quoting conventions, are allowed, and their meaning is the same as their concatenation. ... Thus, "hello" 'world' is equivalent to "helloworld". This feature can be used to reduce the number of backslashes needed, to split long strings conveniently across long lines, *to mix formatted and unformatted strings,* or even to add comments to parts of strings, for example: re.compile("[A-Za-z_]" # letter or underscore "[A-Za-z0-9_]*" # letter, digit or underscore ) Note that this feature is defined at the syntactical level, but implemented at compile time *as an optimization*. The ‘+’ operator must be used to concatenate string expressions at run time. Also note that literal concatenation can use different quoting styles for each component (even mixing raw strings and triple quoted strings). 
*If formatted strings are mixed with unformatted strings, they are concatenated at compile time and the unformatted parts are escaped so they will not be subject to format substitutions.* --- Bruce
data:image/s3,"s3://crabby-images/0f8ec/0f8eca326d99e0699073a022a66a77b162e23683" alt=""
On Fri, Jul 24, 2015 at 11:57 AM, Bruce Leban <bruce@leban.us> wrote:
Thing is, though, it isn't an operator, any more than list display is an operator. Operators take values and result in values. You can break out some part of an expression and it'll have the same result (apart from short-circuit evaluation). With f"...", it's a piece of special syntax, not something you apply to a string. You can't do this: fmt = "Hello, {place}!" place = "world" print(f fmt) If f were an operator, with precedence, then this would work. But it doesn't, for the same reason that this doesn't work: path = "C:\users\nobody" fixed_path = r path These are pieces of syntax, and syntax is at a level prior to all considerations of operator precedence. ChrisA
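The point that prefixes are syntax rather than operators can be checked mechanically. In this sketch the helper is_valid_expression is an invented name; it only compiles the source and never evaluates it.

```python
def is_valid_expression(source):
    """Return True if source parses as a Python expression."""
    try:
        compile(source, '<example>', 'eval')
    except SyntaxError:
        return False
    return True

# A prefix attached to a literal is fine (requires Python 3.6+ for f).
prefixed_literal = is_valid_expression('f"Hello, {place}!"')

# A prefix "applied" to a name is not an expression at all.
f_on_name = is_valid_expression('f fmt')
r_on_name = is_valid_expression('r path')

print(prefixed_literal, f_on_name, r_on_name)
```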
data:image/s3,"s3://crabby-images/291c0/291c0867ef7713a6edb609517b347604a575bf5e" alt=""
On 24.07.2015 04:16, Chris Angelico wrote:
You might be right about this. I think he just used operators as some sort of analogy to figure out which comes first: concat or format. My semantic opinion on this: first format, then concat. Why? Because '...' is an atomic thing and shouldn't be modified by its peer elements (i.e. strings). About implementation: the idea of concatenating first with **implicit** brace escaping illustrated another minor use case for me: no need to escape braces. f'Let {var} = ''{x | x > 3}' This way, the f syntax would really help readability when it comes to situations where many braces are used.
data:image/s3,"s3://crabby-images/6a9ad/6a9ad89a7f4504fbd33d703f493bf92e3c0cc9a9" alt=""
On Fri, Jul 24, 2015 at 07:02:49PM +0200, Sven R. Kunze wrote:
Implicit concatenation is a lexical feature, not a runtime feature. Every other case of implicit concatenation in Python occurs at compile- time, and has since Python 1.5 if not earlier. This would be an exception, and would occur after the function call. That's surprising and inconsistent with all the other examples of implicit concatenation. 'aaa' 'bbb'.upper() returns 'AAABBB', not 'aaaBBB'.
Sorry, I find that really hard to parse without a space between the two fragments, so let me add a space:

    f'Let {var} = ' '{x | x > 3}'

That's better :-)

I completely understand the appeal of your point of view. But it feels wrong to me, I think that it mixes up syntactical features and runtime features inappropriately. If we write

    f'{spam}'

that's syntactic sugar for a call to the format method:

    '{spam}'.format(***)

where the *** stands in for some sort of ChainMap of locals, nonlocals, globals and built-ins, purely for brevity, I'm not implying that should be new syntax.

Since *in all other cases* implicit concatenation occurs before runtime method or function calls:

    f'{spam}' '{eggs}'

should be seen as:

    # option (1)
    '{spam}' '{eggs}' . format(***)

not

    # option (2a)
    '{spam}' . format(***) + '{eggs}'

I'm not implying that the *implementation* must involve an explicit concat after the format. It might, or it might optimize the format string by escaping the braces and concat'ing first:

    # option (2b)
    '{spam}{{eggs}}' . format(***)

Apart from side-effects like time and memory, options (2a) and (2b) are equivalent, so I'll just call it "option (2)" and leave the implementation unspecified.

I think that option (2) is surprising and inconsistent with all other examples of implicit concatenation in Python. I think that *predictability* is a powerful virtue in programming languages, special cases should be avoided if possible.

Option (1) follows from two facts:

- implicit concatenation occurs as early as possible (it is a lexical feature, so it can occur at compile-time, or as close to compile-time as possible);

- f strings are syntactic sugar for a call to format() which must be delayed to run-time, as late as possible.

These two facts alone allow the programmer to reason that f'{spam}' '{eggs}' must be analogous to the case of 'aaa' 'bbb'.upper() above.

Option (2) requires at least one of the two special cases:

- implicit concatenation occurs as early as possible, unless one of the strings is a f string, in which case it occurs... when exactly?

- literal strings like '{eggs}' always stand for themselves, i.e. what you see is what you get, except when implicitly concatenated to f strings, where they are magically escaped.

We already have at least two other ways to get the same result that option (2) gives:

    f'{spam}' + '{eggs}'   # unambiguously format() first, concat second
    f'{spam}{{eggs}}'      # unambiguously escaped braces

Giving implicit concatenation a special case just for convenience sake would, in my opinion, make Python just a little more surprising for little real benefit.

-- Steve
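The options in this message can be emulated with plain str.format(), which makes the difference in results easy to see. This is a sketch only: the namespace dict stands in for the *** placeholder, and the last line shows what a released CPython (3.6 or later) does with the mixed form, added here for reference rather than taken from the message.

```python
spam, eggs = 'SPAM', 'EGGS'
namespace = {'spam': spam, 'eggs': eggs}  # stand-in for the *** placeholder

# Option (1): concatenate first, then format the whole string.
option_1 = ('{spam}' '{eggs}').format(**namespace)

# Option (2a): format the f part, then concatenate the plain part.
option_2a = '{spam}'.format(**namespace) + '{eggs}'

# Option (2b): escape the plain part's braces, concatenate, then format.
option_2b = '{spam}{{eggs}}'.format(**namespace)

# The precedent cited for option (1): concatenation happens before the call.
precedent = 'aaa' 'bbb'.upper()

# Behaviour of actual f-strings when mixed with a plain literal.
released = f'{spam}' '{eggs}'

print(option_1, option_2a, option_2b, precedent, released)
```

On a released Python the mixed form matches option (2), so the plain literal is left untouched.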
data:image/s3,"s3://crabby-images/6a9ad/6a9ad89a7f4504fbd33d703f493bf92e3c0cc9a9" alt=""
TL;DR: Please let's just ban implicit concatenation between f strings (a runtime function call) and non-f strings. The user should be explicit in what they want, using either explicitly escaped braces or the + operator. Anything else is going to be surprising. On Thu, Jul 23, 2015 at 06:57:25PM -0700, Bruce Leban wrote:
Are you saying that any good faith disagreement about people's position is a strawman? If not, I don't understand what you mean by "similar arguments". A strawman argument is explicitly a bad-faith argument. Describing my argument as a strawman implies bad faith on my part. I don't mind if you think my argument is wrong, mistaken or even incoherent, but it is not made in bad faith and you should not imply that it is without good reason. Moving on to the feature:
You stated that "constant folding ... *would* change the semantics" *[emphasis added]*.
In context, I said that constant-folding the *explicit* + concatenation of f'{a}' + '{b}' to f'{a}{b}' would change the semantics. I'm sorry if it was not clear enough that I specifically meant that. I thought that the context was enough to show what I meant. By constant-folding, I mean when the parser/lexer/compiler/whatever (I really don't care which) folds expressions like the following: 'a' + 'b' to this: 'ab' If the parser/whatever does that to mixed f and non-f strings, I think that would be harmful, because it would change the semantics: f'{a}' + '{b}' executed at runtime with no constant-folding is not equivalent to the folded version: f'{a}{b}' Hence, the peephole optimizer should not do that. I hoped that wouldn't be controversial. [...]
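A small sketch of why folding is safe for plain literals but not for the mixed case described here. It runs on Python 3.6 or later; the values of a and b are arbitrary.

```python
a, b = 1, 2

# Plain literals: folding 'a' + 'b' into 'ab' changes nothing observable.
plain_unfolded = 'a' + 'b'
plain_folded = 'ab'

# Mixed case: the right operand is an ordinary string, so its braces are
# literal text; merging it into the f-string would interpolate b instead.
mixed_unfolded = f'{a}' + '{b}'
mixed_folded = f'{a}{b}'

print(plain_unfolded == plain_folded)
print(repr(mixed_unfolded), repr(mixed_folded))
```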
I'm taken aback that you seem to think my pointing out the above is a criticism of an implementer who doesn't even exist yet! We're still discussing what the semantics of f strings should be, and I don't think anyone should be offended or threatened by me being explicit about what the behaviour should be. And for the record, it is not unheard of for constant-folding peephole optimizers to accidentally, or deliberately, change the semantics of code. For example, in D constant-folded 0.1 + 0.2 is not the same as 0.1 + 0.2 done at runtime (constant folding is done at single precision instead of double precision): http://stackoverflow.com/questions/6874357/why-0-1-0-2-0-3-in-d This paper discusses the many pitfalls of optimizing floating point code, and mentions that C may change the value of literal expressions depending on whether they are done at runtime or not: Another effect of this pragma is to change how much the compiler can evaluate at compile time regarding constant initialisations. [...] If it is set to OFF, the compiler can evaluate floating-point constants at compile time, whereas if they had been evaluated at runtime, they would have resulted in different values (because of different rounding modes) or floating-point exception. http://arxiv.org/pdf/cs/0701192.pdf Constant-folding *shouldn't* change the semantics of code, but programmers are only human. They make bad design decisions or write buggy code the same as all of us.
I don't think this is correct. Can you give an example? All the examples I can come up with show implicit concatenation binding more tightly (i.e. it occurs first), e.g.:

    py> 'a' 'ba'.replace('a', 'z')
    'zbz'

not 'abz'. And of course, you can't implicitly concat to a method call:

    py> 'a'.replace('a', 'z') 'ba'
      File "<stdin>", line 1
        'a'.replace('a', 'z') 'ba'
                                 ^
    SyntaxError: invalid syntax

So I think it would be completely unprecedented if the f pseudo-operator bound more tightly than the implicit concatenation.
r isn't a function, it's syntax. There's nothing to apply. This is why I don't think that the behaviour of mixed raw and cooked strings is a good model for mixing f and non-f strings. Both raw and cooked strings are lexical features and should be read from left to right, in the order that they occur, not function calls which must be delayed until runtime. [...]
Now you're the one confusing interface with implementation :-) Such an e string need not be a function call, it could be a lexical feature like raw strings. In fact, I would expect that they should be. These hypothetical e strings could be a lexical feature, or a runtime function, but f *must* be a runtime function since the variables being interpolated don't have values to interpolate until runtime. We have no choice in the matter, whereas we do have a choice with e strings. In any case, I don't think it is a productive use of our time to discuss a hypothetical e string that neither of us intend to propose.
I don't think that flies. It's *not just an optimization* when it comes to f strings. It makes a difference to the semantics. f'{spam}' '{eggs}' being turned into "format first, then concat" has a very different meaning to "concat first, then format". To get the semantics you want, you need a third option: escape first, then concat, then format But there's nothing obvious in the syntax '{eggs}' that tells anyone when it will be escaped and when it won't be. You need to be aware of the special case "when implicitly concat'ed to f strings, BUT NO OTHER TIME, braces in ordinary strings will be escaped". I dislike special cases. They increase the number of things to memorise and lead to surprises.
That's your opinion for the desirable behaviour. I don't like it, I don't expect it. The fact that you have to explicitly document it shows that it is a special case that doesn't follow from the existing behaviour of Python's implicit concatenation rules. I don't think we should have such a special case, when there are already at least two other ways to get the same effect. But since my preferred suggestion is unpopular, I'd much rather just ban implicit concat'ing of f and non-f strings and avoid the whole argument. That's not an onerous burden on the coder: result = (f'{this}' + '{that}') is not that much more difficult to type than: result = (f'{this}' '{that}') and it makes the behaviour clear. -- Steve
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Mon, Jul 20, 2015 at 7:57 PM, Eric V. Smith <eric@trueblade.com> wrote:
Oh, I forgot that.
I was more thinking of translating that specific example to 'x:{} y:{}'.format(a.x, y) which avoids some of the issues your example is trying to clarify. It would still probably be best to limit the syntax inside {} to exactly what regular .format() supports, to avoid confusing users. Though the consistency argument can be played both ways -- supporting absolutely anything that is a valid expression would be more consistent with other places where expressions occur. E.g. in principle we could support operators and function calls here.
I guess that would mean the former restriction. I think it's fine.
No; I really want to avoid having to use globals() or locals() here. -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/50535/5053512c679a1bec3b1143c853c1feacdabaee83" alt=""
On Jul 19, 2015, at 04:35 PM, Mike Miller wrote:
You might take a look at a feature of flufl.i18n, which supports automatic substitutions from locals and globals: http://flufli18n.readthedocs.org/en/latest/docs/using.html#substitutions-and... In general, flufl.i18n builds on PEP 292 $-strings and gettext to support more i18n use cases, especially in multi-language contexts. Still, the substitution features can be more or less used independently, and do a lot of what you're looking for. This feature is only supported on implementations with sys._getframe() though (e.g. CPython). Cheers, -Barry
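For readers unfamiliar with the technique, here is a minimal sketch of substitution from the caller's namespace using PEP 292 $-strings and sys._getframe(). It is not flufl.i18n's API; the function name interpolate is invented for illustration, and like the feature described above it depends on an implementation that provides sys._getframe() (e.g. CPython).

```python
import sys
from string import Template

def interpolate(template):
    """Fill $-placeholders from the caller's globals and locals (sketch)."""
    frame = sys._getframe(1)            # the caller's frame
    namespace = dict(frame.f_globals)   # globals first...
    namespace.update(frame.f_locals)    # ...then locals take precedence
    return Template(template).safe_substitute(namespace)

def greet():
    name = "world"
    return interpolate("Hello, $name!")

greeting = greet()
print(greeting)
```

safe_substitute() leaves unknown placeholders in place instead of raising, which is a design choice of this sketch rather than something stated in the message.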
data:image/s3,"s3://crabby-images/4cf20/4cf20edf9c3655e7f5c4e7d874c5fdf3b39d715f" alt=""
Mike Miller schrieb am 20.07.2015 um 01:35:
Is this an actual use case that people *commonly* run into? I understand that the implicit name lookups here are safe and all that, but I cannot recall ever actually using locals() for string formatting. The above looks magical to me. It's completely unclear that string interpolation is happening behind my back here, unless I already know it. I think it's ok to have a "b" string prefix produce a special kind of string and expect people to guess that and look up what it does if they don't know (and syntax like a string prefix is difficult enough to look up already). Having an "f" prefix interpolate the string with names from the current namespace is way beyond what I would expect a string prefix to do. I'd prefer not seeing a "cool feature" added just "because it's cool". If it additionally is magic, it's usually not a good idea. Stefan
On 08/08/2015 09:49 AM, Stefan Behnel wrote:
There are several ways to accomplish that line. If you look below it, there are two alternatives that are suboptimal as well.
Direct string interpolation is a widely desired feature, something the neckbeards of old, hipsters, and now suits have all agreed on. Since Python uses both " and ' for strings, there isn't an obvious way to separate normal strings from interpolated ones like shell languages do. That leaves, 1. interpolating all strings, or instead 2. marking those we want interpolated. Marking them appears to be the more popular solution here, the last detail is whether it should be f'', i'', or $'', etc. Letters are easier to read perhaps. The implementation is straightforward also. Since the feature will take about 30 seconds to learn, and pay back with a billion keystrokes saved, I'd argue it's a good tradeoff. -Mike
Mike Miller wrote on 09.08.2015 at 02:48:
But how common is it, really? Almost all of the string formatting that I've used lately is either for logging (no help from this proposal here) or requires some kind of translation/i18n *before* the formatting, which is not helped by this proposal either. Meaning, in almost all cases, the formatting will use some more or less simple variant of this pattern: result = process("string with {a} and {b}").format(a=1, b=2) which commonly collapses into result = translate("string with {a} and {b}", a=1, b=2) by wrapping the concrete use cases in appropriate helper functions. I've seen Nick Coghlan's proposal for an implementation backed by a global function, which would at least catch some of these use cases. But it otherwise seems to me that this is a huge sledge hammer solution for a niche problem. Stefan
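The translate-then-format pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `CATALOG` dictionary, the `lang` parameter and the German text are made up for the example and do not come from gettext or any other real i18n library.

```python
# Hypothetical in-memory message catalog, for illustration only.
CATALOG = {
    "de": {"string with {a} and {b}": "Zeichenkette mit {a} und {b}"},
}


def translate(template, lang="de", **values):
    """Look the template up first, then format the translated text."""
    translated = CATALOG.get(lang, {}).get(template, template)
    return translated.format(**values)


result = translate("string with {a} and {b}", a=1, b=2)
print(result)
```

The point of the sketch is the ordering: the untranslated template must still be available as a plain string at lookup time, which an eagerly interpolated literal would not allow.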
On Aug 9, 2015, at 01:00, Stefan Behnel <stefan_ml@behnel.de> wrote:
There are also text-based protocols, file formats, etc. I use string formatting quite a bit for those, and this proposal would help there. Also, do you really never need formatting in log messages? Do you only use highly structured log formats? I'm always debug-logging things like "restored position {}-{}x{}-{} not in current desktop bounds {}x{}" or "file '{}' didn't exist, creating" and so on, and this proposal would help there as well. But you're right that i18n is a bigger problem than it appears. I went back through some of my Swift code, and some of the blog posts others have written about how nifty string interpolation is, and I remembered something really obvious that I'd forgotten: Most user-interface strings in Cocoa[Touch] apps are in the interface-builder objects, or at least explicitly in strings files, not in the source code. (I don't know if this is similar for C# 6's equivalent feature, since I haven't done much C# since .NET was still called .NET, but I wouldn't be surprised.) So you don't have to i18n source-code strings very often. But when you do, you have to revert to the clunky ObjC way of creating an NSLocalizedString with %@ placeholders and calling stringWithFormat: on it. In Python, user-interface strings are very often in the source code, so you'd have to revert to the 3.5 style all the time. Which isn't nearly as clunky as the ObjC style, but still... Now I have to go back and reread Nick's posts to see if his translated-and-interpolated string protocol makes sense and would be easy to use for at least GNU gettext and Cocoa (even without his quasi-associated semi-proposal for sort-of-macros), because without that, I'm no longer sure this is a good idea. If the feature helps tremendously for non-i18n user interface strings, but then you have to throw it away to i18n your code, that could just discourage people from writing programs that work outside the US.
(I suspect Nick and others already made this argument, and better, so apologies if I'm being slow here.)
Andrew Barnert via Python-ideas wrote on 09.08.2015 at 12:27:
Sure, I use formatting there. But the formatting is intentionally done *after* checking that the output passes the current log level. The proposal is about providing a way to format a string literal *before* anyone can do something else with it. So it won't help for logging. Especially not for debug logging. Also, take another look at your examples. They use positional formatting, not named formatting. This proposal requires the use of named formatting and only applies to the exact case where the names or expressions used in the template match the names used for (local/global) variables. As soon as the expressions become non-trivial or the variable names become longer (in order to be descriptive), having to use the same lengthy names and expressions in the template, or at least having to assign them to new local variables beforehand only to make them available for string formatting, will quickly get in the way more than it helps. With .format(), I can (and usually will) just say
output = "writing {filename} ...".format(
    filename=self.build_printable_relative_filename(filename))
rather than having to say
printable_filename = self.build_printable_relative_filename(filename)
output = f"writing {printable_filename} ..."  # magic happening here
del printable_filename  # not used anywhere else
As soon as you leave the cosy little niche where *all* values are prepared ahead of time and stored in beautiful local variables with tersely short and well-chosen names that make your string template as readable as your code, this feature is not for you any more. Stefan
On Aug 9, 2015, at 04:01, Stefan Behnel <stefan_ml@behnel.de> wrote:
Maybe it's just me, but 90%+ of my debug log messages are really only dev log messages for the current cycle/sprint/whatever, and I strip them out before pushing. So as long as they're not getting in the way of performance for the feature I'm working on in the environment I'm working in, there's no reason not to write them as quick&dirty as possible, forcing me to decide which ones will actually be useful in debugging user problems, and clean up and profile them as I do so. And quite often, it turns out that wasting time on a str.format call even when debug logging is turned off really doesn't have any measurable impact anyway, so I may end up using it in the main string or in one of the %s arguments anyway, if it's more readable that way.
Yes, but that's because with simple messages in 3.5, positional formatting is more convenient. Under the proposal, that would change. Compare:
"file '{}' didn't exist, creating".format(fname)
"file '{fname}' didn't exist, creating".format(fname=fname)
"file '{fname}' didn't exist, creating".format(**vars())
f"file '{fname}' didn't exist, creating"
The second version has me repeating the name three times, while the third forces me to think about which scope to pass in, and is still more verbose and more error-prone than the first. But the last one doesn't have either of those problems. Hence the attraction. And of course there's nothing forcing you to use it all the time; when it's not appropriate (and it won't always be), str.format is still there.
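For readers who want to try the four spellings side by side, here is a small sketch. It assumes a Python version where f-strings exist (3.6 or later, which postdates this thread); the `messages` function and the sample file name are illustrative only.

```python
def messages(fname):
    """Return the same message built with four different spellings."""
    positional = "file '{}' didn't exist, creating".format(fname)
    keyword = "file '{fname}' didn't exist, creating".format(fname=fname)
    # vars() is evaluated first, while fname is the only local name here.
    from_scope = "file '{fname}' didn't exist, creating".format(**vars())
    f_string = f"file '{fname}' didn't exist, creating"
    return [positional, keyword, from_scope, f_string]


for text in messages("notes.txt"):
    print(text)
```

All four produce the same text; the differences are in how often the name is repeated and in whether a scope has to be passed explicitly.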
Andrew Barnert via Python-ideas wrote on 09.08.2015 at 14:26:
Yes, I think that's what I dislike most about it. It's only a special purpose feature that forces me to learn two things instead of one in order to use it. Or actually more than two. I have to learn how to use it, I have to understand the limitations and learn to detect when I reach them (especially in terms of code style), and know how to transform my code to make it work again afterwards. One of the obvious quick comments on reddit was this: https://xkcd.com/927/ Stefan
On 08/09/2015 04:01 AM, Stefan Behnel wrote:
This discussion reminds me of a debate I had with a co-worker last year. I think I argued "your" side on that one, Stefan. He insisted on writing log lines like this:
log.debug('File "{filename}" has {lines} lines.'.format(filename=filename, lines=lines))  # etc
He said this form was most readable, because you could ignore the right side. I said we should log like this instead, not only because it's shorter, but also because the formatting doesn't happen unless the log level is reached:
log.debug('File "%s" has %s lines.', filename, lines)
I also argued on performance grounds, but when I tried to prove it in real applications the difference was almost nothing, perhaps because the logger has to check a few things before deciding to format the string. Logging from a tight loop probably would create more overhead, but we've rarely done that. So performance didn't turn out to be a good reason to choose in most cases. In a tight loop you could still use isEnabledFor(level), for example. This experience did inform my original feature request; the result is now shorter, more readable, and the performance hit is negligible:
log.debug(f'File "{filename}" has {lines} lines.')
Also, my coworker and I would be able to move on to the next argument. ;) Another feature request would be to have logging support .format syntax like it does printf syntax; anyone know why that never happened?
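The difference between deferred and eager formatting in the two logging styles can be observed with a small experiment. This is a sketch, not a benchmark: the `Counted` class and the logger name `fstring_demo` are made up for illustration, and f-string syntax requires Python 3.6 or later.

```python
import logging


class Counted:
    """Counts how many times it is rendered to text."""

    def __init__(self):
        self.renders = 0

    def __str__(self):
        self.renders += 1
        return "example.txt"


log = logging.getLogger("fstring_demo")  # hypothetical logger name
log.addHandler(logging.NullHandler())
log.setLevel(logging.WARNING)  # debug messages are filtered out

lazy, eager = Counted(), Counted()

# printf-style arguments: the logger checks the level first and never formats.
log.debug('File "%s" has %s lines.', lazy, 10)

# f-string: the text is built before log.debug() is even called.
log.debug(f'File "{eager}" has {10} lines.')

# Explicit guard for hot loops, as mentioned above.
if log.isEnabledFor(logging.DEBUG):
    log.debug(f'File "{eager}" has {10} lines.')

print(lazy.renders, eager.renders)
```

Whether the extra rendering matters depends on how expensive the values are to format, which matches the observation above that the measured difference was small in real applications.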
This is where I disagree. Because if I have an important variable that I am using and bothering to log, such as a filename, I undoubtedly am going to use it again soon, to do an operation with it. So, I'll want to keep that variable around to use it again, rather than doing a recalculation. Cheers, -Mike
On Mon, Jul 20, 2015 at 9:50 AM, Terry Reedy <tjreedy@udel.edu> wrote:
I expect they are - compare the percent formatting example:
csstext += '%s%s%s{%s' % (nl, key, space, nl)
The double open brace makes for a literal open brace in the end result. It's the same ugliness as trying to craft a regular expression to match Windows path names without raw string literals, so I completely sympathize with the desire for something better. But I don't think f"fmt" is it :) ChrisA
Hi! On Sun, Jul 19, 2015 at 07:50:52PM -0400, Terry Reedy <tjreedy@udel.edu> wrote:
I'm sure they are. The code is supposed to generate something equivalent to
csstext += '''
p.italic {
'''
to be extended later with something like
csstext += '''
font-style: italic;
}
'''
-- Terry Jan Reedy
Oleg. -- Oleg Broytman http://phdru.name/ phd@phdru.name Programmers don't die, they just GOSUB without RETURN.
I've followed all the posts in this thread, and although my particular opinion has little significance, I'm definitely -1 on this idea (or actually -1000). To my mind, we have already gone vastly too far in proliferating near synonyms for templating strings. Right now, I can type:
"My name is %(first)s %(last)s" % locals()
Or:
"My name is {first} {last}".format(**locals())
Or:
string.Template("My name is $first $last").substitute(**locals())
And they all mean the same thing, with pretty much the same capabilities. I REALLY don't want a 4th or 5th way to spell the same thing... let alone one with weird semantics with lots of edge cases that are almost impossible to teach. I really DO NOT want to spell the same thing as f"..." or !"...", let alone have every single string magically become a runtime evaluated complex object like "My name is \{first}". Yes, I know the oddball edge cases each style supports are slightly different... but that's exactly the problem. It's yet another thing to address an ever-so-slightly different case, where the actual differences are impossible to explain to students; and where there's frankly nothing you can't do with just a couple extra characters using str.format() right now. Yours, David... On Sun, Jul 19, 2015 at 4:12 PM, Mike Miller <python-ideas@mgmiller.net> wrote:
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
On Aug 6, 2015 3:03 PM, "Guido van Rossum" <guido@python.org> wrote:
Unfortunately, all spellings that require calling locals() are wrong.
Is this where the potential source of surprising error is? * Explicit / Implicit locals() * To me, the practicality of finding '%' and .format is more important than the convenience of an additional syntax with implicit scope, but is that beside the point?
On 7 August 2015 at 06:35, Wes Turner <wes.turner@gmail.com> wrote:
Yes - it's what creates the temptation for people to use sys._getframe() to hide the locals() calls, and either approach hides the name references from lexical analysers (hence Barry's comment about false alarms regarding "unused locals" when scanning code that uses flufl.i18n). When people are being tempted to write code that is too clever for a computer to easily follow without executing it, that's cause for concern (being able to *write* such code is useful for exploratory purposes, but it's also the kind of thing that's desirable to factor out as a code base matures). When it comes to name based string interpolation, the current "correct" approach (which even computers can read) requires duplicating the name references explicitly in constructs like:
print("This interpolates {a} and {b}".format(a=a, b=b))
Which doesn't fare well for readability when compared to sys._getframe() based implicit approaches like flufl.i18n's:
print(_("This interpolates $a and $b"))
The f-string proposal provides a way to write the latter kind of construct in a more explicit way that even computers can learn to read (without executing it). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
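To make the sys._getframe() technique concrete, here is a minimal sketch of an implicit interpolation helper. It is not flufl.i18n's actual implementation: the `interpolate` and `demo` names are hypothetical, no translation step is included, and `sys._getframe()` is an implementation detail that is available on CPython but not guaranteed elsewhere.

```python
import sys
from string import Template


def interpolate(template):
    """Hypothetical helper: substitute $names from the caller's namespace."""
    caller = sys._getframe(1)  # CPython-specific access to the calling frame
    namespace = dict(caller.f_globals)
    namespace.update(caller.f_locals)  # locals shadow globals
    return Template(template).substitute(namespace)


def demo():
    a, b = "spam", "eggs"
    # A static analyser sees no use of a or b: the names only occur in a string.
    return interpolate("This interpolates $a and $b")


print(demo())
```

The comment inside `demo` is the concern raised above: the references to `a` and `b` are invisible to tools that read the code without executing it.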
On Thu, Aug 6, 2015 at 10:35 PM, Wes Turner <wes.turner@gmail.com> wrote:
This is a big deal because of the worry about code injection. A "classic" format string given access to locals() (e.g. using s.format(**locals())) always stirs worries about code injection if the string is a variable. The proposed forms of string interpolation don't give access to locals *other than the locals where the string "literal" itself exists*. This latter access is no different from the access to locals in any expression. (The same for globals(), of course.) The other issue with explicit locals() is that to the people who would most benefit from variable interpolation (typically relatively unsophisticated users), it is magical boilerplate. (Worse, it's boilerplate that their more experienced mentors will warn them against because of the code injection worry.)
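The code injection worry can be illustrated with a short sketch. The `render` function, the variable names and the sample values are invented for this example; the point is only that when the template is a variable, the template (not the programmer) decides which local names get read.

```python
def render(template):
    password = "hunter2"  # a local the author never meant to expose
    user = "guest"
    # Classic formatting with the whole local namespace handed over.
    return template.format(**locals())


print(render("Hello {user}"))        # the intended use

untrusted = "Hello {password}"       # imagine this came from user input
print(render(untrusted))             # leaks the other local
```

With the proposed literal-only syntax, the set of names that can be read is fixed in the source code where the literal appears, which is the distinction drawn in the paragraph above.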
I'm not sure what your point is here. (Genuinely not sure -- this is not a rhetorical flourish.) Are you saying that you prefer the explicit formatting operation because it acts as a signal to the reader that formatting is taking place? Maybe in the end the f-string proposal is the right one -- it's minimally obtrusive and yet explicit, *and* backwards compatible? This isn't saying I'm giving up on always-interpolation; there seems to be at least an even split between languages that always interpolate (PHP?), languages that have a way to explicitly disable it (like single quotes in shell), and languages that require some sort of signal (like C#). -- --Guido van Rossum (python.org/~guido)
On Fri, Aug 7, 2015 at 6:12 PM, Guido van Rossum <guido@python.org> wrote:
PHP, like shell languages, has "interpolated strings with $double $quotes" and 'uninterpreted strings with single quotes'. At my last PHP job, the style guide eschewed any use of double quoted strings, but that job's style guide wasn't something I'd recommend, so that may not be all that significant. (Part of the problem was that one of the programmers used string interpolation in ways that killed readability, so I can understand the complaint.) ChrisA
On Aug 7, 2015 3:13 AM, "Guido van Rossum" <guido@python.org> wrote:
A "classic" format string given access to locals() (e.g. using s.format(**locals())) always stirs worries about code injection if the string is a variable. The proposed forms of string interpolation don't give access to locals *other than the locals where the string "literal" itself exists*. This latter access is no different from the access to locals in any expression. (The same for globals(), of course.)
The other issue with explicit locals() is that to the people who would most benefit from variable interpolation (typically relatively unsophisticated users), it is magical boilerplate. (Worse, it's boilerplate that their more experienced mentors will warn them against because of the code injection worry.)
* To me, the practicality of finding '%' and .format is more important than the convenience of an additional syntax with implicit scope, but is that beside the point?
I'm not sure what your point is here. (Genuinely not sure -- this is not a rhetorical flourish.) Are you saying that you prefer the explicit formatting operation because it acts as a signal to the reader that formatting is taking place?
I should prefer str.format() when I reach for str.__mod__() because it's more likely that under manual review I'll notice or grep ".format(" than "%", sheerly by character footprint.
Maybe in the end the f-string proposal is the right one -- it's minimally obtrusive and yet explicit, *and* backwards compatible? This isn't saying I'm giving up on always-interpolation; there seems to be at least an even split between languages that always interpolate (PHP?), languages that have a way to explicitly disable it (like single quotes in shell), and languages that require some sort of signal (like C#).
A convenient but often dangerous syntactical shortcut (because it is infeasible to track more than 7+-2 glocal variables in mind at once). * Jinja2 autoescaping w/ LaTeX code is much easier w/ different operators. * f'... {Cmd}"' * r'... {Cmd}"' 0 / O
-- --Guido van Rossum (python.org/~guido)
On Aug 07, 2015, at 10:12 AM, Guido van Rossum wrote:
I took a look at the Mailman trunk. It's definitely the case that the majority of the uses of flufl.i18n's string interpolation are with in-place literals. A few examples of where a variable is passed in instead: * An error notification where some other component calculates the error message and is passed to a generic reporting function. The error message may be composed from several literal bits and pieces. * Translate a template read from a data file. I'd put this in the camp of consenting adults. It's useful and rare, so if I saw non-literals in a code review, I'd question it, but probably not disallow it. I'd want to spend extra time reviewing the code to be assured it's not a vector for code injections.
Which is why I think it can't be implicit for all strings. E.g. in an i18n context, seeing _('$person did $something') is a very explicit marker.
Although I didn't say it, I'd answer this question "yes". Cheers, -Barry
On 08/07/2015 04:12 AM, Guido van Rossum wrote:
I think one of the advantages of f-strings is they are explicitly created in the context of where the scope is defined. That scope includes non-locals too. So locals and globals together are a narrower selection than the defined static scope. Non-locals can't mask globals if they aren't included. So "...".format(*locals(), **globals()) is not the same as when the names are explicitly supplied as keywords. If it is opened up to dynamic scope, all bets are off. That hasn't been suggested, but when functions use locals and globals as arguments, I think that is the security concern. One of the questions I have is whether there will be a way to create an f-string other than by a literal. So far, I think no, because it's not an object, but a protocol. f"..." ---> "...".format(...). That doesn't mean we can't have a function to do that. Currently the best way would be to do eval('f"..."'), but it wouldn't be exactly the same because eval does not include the non-local part of the scope. It seems that hasn't been an issue for other things, so maybe it's not an issue here as well. If all strings get scanned, I think it could complicate how strings act in various contexts. For example when a string is used both as f-string and then gets used again as a template or pattern. That suggests there should be a way to turn that scanning off for some strings. (?) So far I'm -1 on all strings, but +.25 on the explicit f-string. (Still waiting to see the PEP before I give it a full +1.) Cheers, Ron
Guido van Rossum <guido@python.org> writes:
Googling e.g., "python locals code injection" yields nothing specific: http://stackoverflow.com/questions/2515450/injecting-variables-into-the-call... http://stackoverflow.com/questions/13312240/is-a-string-formatter-that-pulls... Could you provide an example what is wrong with "{a}{b}".format(**vars())? Is it correct to say that there is nothing wrong with it as long as the string is always a *literal*?
Could you provide an example what is wrong with "{a}{b}".format(**vars())?
["{a}{b}".format(**vars()) for _ in range(1)]
Comprehensions have their own scope. This needs to be a compile-time transform into a normal variable lookup. Cheers, Steve
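The scoping problem can be reproduced with a nested function, which is used here instead of a comprehension because the details of comprehension scoping vary between Python versions. This is a sketch with invented names (`outer`, `with_vars`, `with_fstring`), and the f-string half assumes Python 3.6 or later.

```python
def outer():
    a, b = "x", "y"

    def with_vars():
        # vars() sees only this function's own (empty) local namespace.
        return "{a}{b}".format(**vars())

    def with_fstring():
        # Compiled as ordinary name lookups, so a and b are found in outer().
        return f"{a}{b}"

    try:
        with_vars()
    except KeyError as exc:
        missing = exc.args[0]
    else:
        missing = None
    return missing, with_fstring()


print(outer())
```

The `vars()` version fails because the names are never mentioned in the nested function's code, so nothing tells the compiler to make them available there; the literal form names them directly and follows the normal lookup rules.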
On Mon, Jul 20, 2015 at 9:35 AM, Mike Miller <python-ideas@mgmiller.net> wrote:
Point to note: Currently, all the string prefixes are compile-time directives only. A b"bytes" or u"unicode" prefix affects what kind of object is produced, and all the others are just syntactic differences. In all cases, a string literal is a single immutable object which can be stashed away as a constant. What you're suggesting here is a thing that looks like a literal, but is actually a run-time operation. As such, I'm pretty dubious; coupled with the magic of dragging values out of the enclosing namespace, it's going to be problematic as regards code refactoring. Also, you're going to have heaps of people arguing that this should be a shorthand for str.format(**locals()), and about as many arguing that it should follow the normal name lookups (locals, nonlocals, globals, builtins). I'm -1 on the specific idea, though definitely sympathetic to the broader concept of simplified formatting of strings. Python's printf-style formatting has its own warts (mainly because of the cute use of an operator, rather than doing it as a function call), and still has the problem of having percent markers with no indication of what they'll be interpolating in. Anything that's explicit is excessively verbose, anything that isn't is cryptic. There's no easy fix. ChrisA
"Point to note: Currently, all the string prefixes are compile-time directives only. A b"bytes" or u"unicode" prefix affects what kind of object is produced, and all the others are just syntactic differences. In all cases, a string literal is a single immutable object which can be stashed away as a constant. What you're suggesting here is a thing that looks like a literal, but is actually a run-time operation." Why wouldn't this be a compile time transform from f"string with braces" into "string with braces".format(x=x, y=y, ...) where x, y, etc are the names in each pair of braces (with an error if it can't get a valid identifier out of each format code)? It's syntactic sugar for a simple function call with perfectly well defined semantics - you don't even have to modify the string literal. Defined as a compile time transform like this, I'm +1. As soon as any suggestion mentions "locals()" or "globals()" I'm -1. Cheers, Steve
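The compile-time transform Steve describes can be approximated as a source-to-source rewrite. The sketch below is an assumption-laden illustration, not how any Python compiler works: `desugar` is a hypothetical helper, it uses the standard `string.Formatter().parse()` method to find the field names, and it rejects anything that is not a plain identifier, as suggested in the message.

```python
import string


def desugar(literal):
    """Return source text for the .format() call an f-literal would become."""
    names = []
    for _text, field, _spec, _conv in string.Formatter().parse(literal):
        if field is None:
            continue  # plain text or an escaped brace, nothing to look up
        if not field.isidentifier():
            raise ValueError(f"not a simple name: {field!r}")
        if field not in names:
            names.append(field)
    kwargs = ", ".join(f"{name}={name}" for name in names)
    return f"{literal!r}.format({kwargs})"


source = desugar("{nl}{key}{space}{{{nl}")
print(source)
```

Because the output is an ordinary expression that mentions `nl`, `key` and `space` by name, it is resolved by the usual scoping rules wherever it is compiled, with no call to locals() or globals().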
On Mon, Jul 20, 2015 at 10:43 AM, Steve Dower <Steve.Dower@microsoft.com> wrote:
It'd obviously have to be a compile-time transformation. My point is that it would, unlike all other forms of literal, translate into a function call. How is your "x=x, y=y" version materially different from explicitly mentioning locals() or globals()? The only significant difference is that your version follows the scope order outward, where locals() and globals() call up a specific scope each. Will an f"..." format string be mergeable with other strings? All the other types of literal can be (apart, of course, from mixing bytes and unicode), but this would have to be something somehow different. In every way that I can think of, this is not a literal - it is a construct that results in a run-time operation. A context-dependent operation, at that. That's why I'm -1 on this looking like a literal. ChrisA
Chris Angelico wrote:
Excluding dictionary literals, of course. And class definitions. Decorators too, and arguably the descriptor protocol and __getattribute__ make things that look like attribute lookups into function calls. Python is littered with these, so I'm not sure that your point has any historical support.
Yes, it follows normal scoping rules and doesn't invent/define/describe new ones for this particular case. There is literally no difference between the function call version and the prefix version wrt scoping. As an example of why "normal rules" are better than "locals()/globals()", how would you implement this using just locals() and globals()?
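The example that originally followed this question is not preserved here. As a stand-in, here is a minimal sketch of the scoping problem, using a nested function instead of a comprehension. The names `outer`, `inner_lookup` and `inner_fstring` are illustrative, and the `f'...'` form assumes Python 3.6 or later, where this syntax eventually became available.

```python
def outer():
    target = "world"  # lives only in outer's scope

    def inner_lookup():
        # Nothing in this body names `target`, so the compiler creates no
        # closure cell for it; neither dictionary can find it.
        return "target" in locals() or "target" in globals()

    def inner_fstring():
        # Here the compiler sees `target` inside the braces and captures it
        # through the normal closure machinery.
        return f"Hello, {target}!"

    return inner_lookup(), inner_fstring()


found, greeting = outer()
print(found)     # False
print(greeting)  # Hello, world!
```

The point of the sketch is that a lookup based only on `locals()` and `globals()` misses enclosing function scopes, while a compile-time transformation follows the same name resolution as any other expression.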
Given that this is the current behaviour:
I don't mind saying "no" here, especially since the merging is done while compiling, but it would be possible to generate a runtime concatenation here. Again, you only "know" that code (currently) has no runtime effect because, well, because you know it. It's a change, but it isn't world ending.
In every way that I can think of, this is not a literal - it is a construct that results in a run-time operation.
Most new Python developers (with backgrounds in other languages) are surprised that "class" is a construct that results in a run-time operation, and would be surprised that writing a dictionary literal also results in a run-time operation if they ever had reason to notice. I believe the same would apply here.
A context-dependent operation, at that.
You'll need to explain this one for me - how is it "context-dependent" when you are required to provide a string prefix?
That's why I'm -1 on this looking like a literal.
I hope you'll reconsider, because I think you're working off some incorrect or over-simplified beliefs. (Though this reply isn't just intended for Chris, but for everyone following the discussion, so I hope *everyone* considers both sides.) Cheers, Steve
"return [locals()[x] for _ in range(1)]" I lost some quotes here around the x, but it doesn't affect the behavior - you still can't get outside the comprehension scope here. Cheers, Steve
On Mon, Jul 20, 2015 at 11:33 AM, Steve Dower <Steve.Dower@microsoft.com> wrote:
Dictionary/list display isn't a literal, and every time it's evaluated, you get a brand new object, not another reference to the same literal. Compare:
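The comparison that followed is not preserved here. A small sketch of the kind of contrast being described, with illustrative function names and assuming CPython's handling of constants:

```python
def new_list():
    return [1, 2, 3]        # list display: built afresh on every call


def same_string():
    return "hello, world"   # string literal: stored once as a constant


first, second = new_list(), new_list()
print(first == second)      # True  (equal contents)
print(first is second)      # False (two distinct objects)

s, t = same_string(), same_string()
print(s is t)               # True on CPython: the same constant object
```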
Class and function definitions are also not literals, although people coming from other languages are often confused by this. (I've seen people write functions down the bottom of the file that are needed by top-level code higher up. It's just something you have to learn - Python doesn't "declare" functions, it "defines" them.) Going the other direction, there are a few things that you might think are literals but aren't technically so, such as "2+3j", which is actually two literals (int and imaginary) and a binary operation; but constant folding makes them functionally identical to constants. The nearest equivalent to this proposal is tuple display, which can sometimes function almost like a literal:
This disassembles to a simple fetching of a constant. However, it's really just like list display plus constant folding - the compiler notices that it'll always produce the same tuple, so it optimizes it down to a constant. In none of these cases is a string ever anything other than a simple constant. That's why this proposal is a distinct change; all of the cases where Python has non-constants that might be thought of as constants, they contain expressions (or even statements - class/function definitions), and are syntactically NOT single entities. Now, that's not to say that it cannot possibly be done. But I personally am not in favour of it.
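A hedged way to observe the constant folding described above is to inspect the code object's constants. This is a sketch that relies on CPython behaviour, and `tuple_display` is an illustrative name; `dis.dis(tuple_display)` shows the same thing at the bytecode level.

```python
def tuple_display():
    return (1, 2, 3)


# The compiler folds the display into a single constant stored on the
# code object, so every call hands back the very same tuple.
print((1, 2, 3) in tuple_display.__code__.co_consts)   # True on CPython
print(tuple_display() is tuple_display())              # True on CPython
```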
Sure, that's where following the scoping rules is better than explicitly calling up locals(). On the flip side, going for locals() alone means you can easily and simply protect your code against grabbing the "wrong" things, by simply putting it inside a nested function (which is what your list comp there is doing), and it's also easier to explain what this construct does in terms of locals() than otherwise (what if there are attribute lookups or subscripts?).
Fair enough. I wouldn't mind saying "no" here too - in the same way that it's a SyntaxError to write u"hello" b"world", it would be a SyntaxError to mix either with f"format string".
That's part of learning the language (which things are literals and which aren't). Expanding the scope of potential confusion is a definite cost; I'm open to the argument that the benefit justifies that cost, but it is undoubtedly a cost.
A context-dependent operation, at that.
You'll need to explain this one for me - how is it "context-dependent" when you are required to provide a string prefix?
    def func1():
        x = "world"
        return f"Hello, {x}!"

    def func2():
        return f"Hello, {x}!"

They both return what looks like a simple string, but in one, it grabs a local x, and in the other, it goes looking for a global. This is one of the potential risks of such things as decimal.Decimal literals, because literals normally aren't context-dependent, but the Decimal constructor can be affected by precision controls and such. Again, not a killer, but another cost.
That's why I'm -1 on this looking like a literal.
I hope you'll reconsider, because I think you're working off some incorrect or over-simplified beliefs. (Though this reply isn't just intended for Chris, but for everyone following the discussion, so I hope *everyone* considers both sides.)
Having read your above responses, I'm now -0.5 on this proposal. There is definite merit to it, but I'm a bit dubious that it'll end up with one of the problems of PHP code: the uncertainty of whether something is a string or a piece of code. Some editors syntax-highlight all strings as straight-forward strings, same color all the way, while others will change color inside there to indicate interpolation sites. Which is correct? At least here, the prefix on the string makes it clear that this is a piece of code; but it'll take editors a good while to catch up and start doing the right thing - for whatever definition of "right thing" the authors choose. Maybe my beliefs are over-simplified, in which case I'll be happy to be proven wrong by some immensely readable and clear real-world code examples. :) ChrisA
On 20 July 2015 at 10:43, Steve Dower <Steve.Dower@microsoft.com> wrote:
I'm opposed to a special case compile time transformation for string formatting in particular, but in favour of clearly-distinct-from-anything-else syntax for such utilities: https://mail.python.org/pipermail/python-ideas/2015-June/033920.html

It would also need a "compile time import" feature for namespacing purposes, so you might be able to write something like:

    from !string import format # Compile time import

    # Compile time transformation that emits a syntax error for a malformed format string
    formatted = !format("string with braces for {name} {lookups} transformed to a runtime <this str>.format(name=name, lookups=lookups) call")

    # Equivalent explicit code (but without any compile time checking of format string validity)
    formatted = "string with braces for {name} {lookups} transformed to a runtime <this str>.format(name=name, lookups=lookups) call".format(name=name, lookups=lookups)

The key for me is that any such operation *must* be transparent to the compiler, so it knows exactly what names you're looking up and can generate the appropriate references for them (including capturing closure variables if necessary). If it's opaque to the compiler, then it's no better than just using a string, which we can already do today.

Regards, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
So, macros basically? The real ones, not #define. What's wrong with special casing text strings (a little bit more than they already have been)?
On 20 July 2015 at 14:46, Steve Dower <Steve.Dower@microsoft.com> wrote:
I've wished for a cleaner shell command invocation syntax many more times than I've wished for easier string formatting, but I *have* wished for both. Talking to the scientific Python folks, they've often wished for a cleaner syntax to create deferred expressions with the full power of Python's statement level syntax. Explicitly named macros could deliver all three of those, without the downsides of implicit globally installed macros that are indistinguishable from regular syntax. By contrast, the string prefix system is inherently cryptic (being limited to single letters only) and not open to extension and experimentation outside the reference interpreter. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Jul 19, 2015, at 21:58, Nick Coghlan <ncoghlan@gmail.com> wrote:
MacroPy already gives you macros that are explicitly imported, and explicitly marked on use, and nicely readable. And it already works, with no changes to Python, and it includes a ton of little features that you'd never want to add to core Python. There are definitely changes to Python that could make it easier to improve MacroPy or start a competing project, but I think it would be more useful to identify and implement those changes than to try to build a macro system into Python itself. (Making it possible to work on the token level, or to associate bytes/text/tokens/trees/code with each other more easily, or to hook syntax errors and reprocess the bytes/text/tokens, etc. are some such ideas. But I think the most important stuff wouldn't be new features, but removing the annoyances that get in the way of trying to build the simplest possible new macro system from scratch for 3.5, and we probably can't know what those are until someone attempts to build such a thing.)
On 20 July 2015 at 15:18, Andrew Barnert <abarnert@yahoo.com> wrote:
I see nothing explicit about https://pypi.python.org/pypi/MacroPy or the examples at https://github.com/lihaoyi/macropy#macropy, as it looks just like normal Python code to me, with no indication that compile time modifications are taking place. That's not MacroPy's fault - it *can't* readily be explicit the way I would want it to be if it's going to reuse the existing AST compiler to do the heavy lifting. However, I agree the MacroPy approach to tree transformations could be a good backend concept. I'd previously wondered how you'd go about embedding third party syntax like shell expressions or format strings, but eventually realised that combining an AST transformation syntax with string quoting works just fine there. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Jul 19, 2015, at 22:43, Nick Coghlan <ncoghlan@gmail.com> wrote:
I suppose what I meant by explicit is things like using [] instead of () for macro calls and quick lambda definitions, s[""] for string interpolation, etc. Once you get used to it, it's usually obvious at a glance where code is using MacroPy.
I think you want to be able to hook the tokenizer here as well. If you want f"..." or !f"...", that's hard to do at the tree level or the text level; you'd have to do something like f("...") or "f..." instead. But at the token level, it should be trivial. (Well, the second one may not be _quite_ trivial, because I believe there are some cases where you get a !f error instead of a ! error and an f name; I'd have to check.) I'm pretty sure I could turn my user literal suffix hack into an f-prefix hack in 15 minutes or so. (Obviously it would be nicer if it were doable in a more robust way, and using a framework rather than rewriting all the boilerplate. But my point is that, even without any support at all, it's still not that hard.) I think you could also use token transformations to do a lot of useful shell-type expressions without quoting, although not full shell syntax; you'd have to play around with the limitations to see if they're worth the benefit of not needing to quote the whole thing. But as I said before, the real test would be trying to build the framework mentioned parenthetically above and see where it gets annoying and what could change between 3.5 and 3.6 to unannoyingize the code.
On 20 July 2015 at 18:01, Andrew Barnert <abarnert@yahoo.com> wrote:
Is this code using MacroPy for compile time transformations? data = source[lookup] You have no idea, and neither do I. Instead, we'd be relying on our rote memory to recognise certain *names* as being typical MacroPy operations - if someone defines a new transformation, our pattern recognition isn't going to trigger properly. It doesn't help that my rote memory is awful, so I *detest* APIs that expect me to have a good one and hence "just know" when certain operations are special and don't work the same way as other operations. My suggested "!(expr)" notation is based on the idea of providing an inline syntactic marker to say "magic happening here" (with the default anonymous transformation being to return the AST object itself).
I figured out that AST->AST is fine, as anything else can be handled as quoted string transformations, which then gets you all the nice benefits of strings literals (choice of single or double quotes, triple-quoting for multi-line strings, escape sequences with the option of raw strings, etc), plus a clear inline marker alerting the reader to the fact that you've dropped out of Python's normal syntactic restrictions. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Chris Angelico writes:
I'm -1 on the specific idea, though definitely sympathetic to the broader concept of simplified formatting of strings.
So does everybody. But we've seen many iterations: Perl/shell-style implicit interpolation apparently was right out from the very beginning of Python. The magic print statement was then deprecated in favor of a function. So I suppose it will be very hard to convince the BDFL (and anything implicit would surely need his approval) of anything but a function or an operator. We have the % operator taking a printf-style format string and a tuple of values to interpolate. It's compact and easy to use with position indexes into the tuple for short formats and few values, but is nearly unreadable and not easy to write for long formats with many interpolations, especially if they are repeated.
I think the operator is actually a useful feature, not merely "cute". It directs the focus to the format string, rather than the function call.
and still has the problem of having percent markers with no indication of what they'll be interpolating in.
Not so. We have the more modern (?) % operator that takes a format string with named format sequences and a dictionary. This seems to be close to what the OP wants:

    val = "readable simple formatting method"
    print("This is a %(val)s." % locals())

(which actually works at module level as well as within a function). I suppose the OP will claim that an explicit call to locals() is verbose and redundant, but if that really is a problem:

    def format_with_locals(fmtstr):
        return fmtstr % locals()

(of course with a nice short name, mnemonic to the author). Or for format strings to be used repeatedly with different (global -- the "locals" you want are actually nonlocal relative to a method, so there's no way to get at them AFAICS) values, there's this horrible hack:

    >>> class autoformat_with_globals(str):
    ...     def __pos__(self):
    ...         return self % globals()
    ...
    >>> a = autoformat_with_globals("This is a %(description)s.")
    >>> description = "autoformatted string"
    >>> +a
    'This is a autoformatted string.'

with __neg__ and __invert__ as alternative horrible hacks.

We have str.format. I've gotten used to str.format but for most of my uses mapped %-formatting would work fine.

We have an older proposal for a more flexible form of templating using the Perl/shell-ish $ operator in format strings. And we have a large number of templating languages from web frameworks (Django, Jinja, etc). None of these seem universally applicable.

It's ugly in one sense (TOOWTDI violation), but ISTM that positional % for short interactive use, mapped % for templating where the conventional format operators suffice, and str.format for maximum explicit flexibility in programs, with context-sensitive formatting of new types, is an excellent combination.
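For reference, the three existing styles named in that closing paragraph can be put side by side. This is only an illustration; the `record` dictionary and its keys are made up for the example.

```python
record = {"name": "margin", "value": 4, "unit": "px"}

# Positional %: compact, but the markers say nothing about what fills them.
positional = "%s: %d%s" % (record["name"], record["value"], record["unit"])

# Mapped %: named markers, values supplied by an explicit mapping.
mapped = "%(name)s: %(value)d%(unit)s" % record

# str.format: named fields, values passed as keyword arguments.
explicit = "{name}: {value}{unit}".format(**record)

print(positional)  # margin: 4px
print(mapped)      # margin: 4px
print(explicit)    # margin: 4px
```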
Automatically injecting from the locals or globals is a nice source of bugs. Explicit is better than implicit, especially in case where it can lead to security bugs. -1 --- Bruce Check out my new puzzle book: http://J.mp/ingToConclusions Get it free here: http://J.mp/ingToConclusionsFree (available on iOS) On Sun, Jul 19, 2015 at 4:35 PM, Mike Miller <python-ideas@mgmiller.net> wrote:
Hello,

On Sun, 19 Jul 2015 16:35:01 -0700 Mike Miller <python-ideas@mgmiller.net> wrote:

[]

"Not sure" sounds convincing. Deal - let's keep being explicit rather than implicit. Brevity?

    def _(fmt, dict):
        return fmt.format(**dict)

    __ = globals
    ___ = locals
    foo = 42
    _("{foo}", __())

If that's not terse enough, you can take Python3, and go thru Unicode planes looking for funky-looking letters, then you hopefully can reduce to

    .("{foo}", .())

Where dots aren't dots, but funky-looking letters.

-- Best regards, Paul mailto:pmiscml@gmail.com
Hmm, I prefer this recipe sent to me directly by joejev:
For yours I'd use the "pile of poo" character: ;) 💩("{foo}", _()) Both of these might be slower and a bit more awkward than the f'' idea, though I like them. As to the original post, a pyflakes-type script might be able to look for name errors to assuage concerns, but as I mentioned before I believe the task of matching string/vars is still necessary. -Mike On 07/19/2015 04:59 PM, Paul Sokolovsky wrote:
On 07/19/2015 07:35 PM, Mike Miller wrote:
Disclaimer: not well tested code.

This code basically does what you want. It eval's the variables in the caller's frame. Of course you have to be able to stomach the use of sys._getframe() and eval():

    #######################################
    import sys
    import string

    class Formatter(string.Formatter):
        def __init__(self, globals, locals):
            self.globals = globals
            self.locals = locals

        def get_value(self, key, args, kwargs):
            return eval(key, self.globals, self.locals)

    # default to looking at the parent's frame
    def f(str, level=1):
        frame = sys._getframe(level)
        formatter = Formatter(frame.f_globals, frame.f_locals)
        return formatter.format(str)
    #######################################

Usage:

    foo = 42
    print(f('{foo}'))

    def get_closure(foo):
        def _():
            foo # hack: else we see the global 'foo' when calling f()
            return f('{foo}:{sys}')
        return _

    print(get_closure('c')())

    def test(value):
        print(f('value:{value:^20}, open:{open}'))

    value = 7
    open = 3
    test(4+3j)
    del(open)
    test(4+5j)

Produces:

    42
    c:<module 'sys' (built-in)>
    value:       (4+3j)       , open:3
    value:       (4+5j)       , open:<built-in function open>

Eric.
I would prefer something more like:

    def f(s):
        caller = inspect.stack()[1][0]
        return s.format(dict(caller.f_globals, **caller.f_locals))

On July 20, 2015 8:56:54 AM CDT, "Eric V. Smith" <eric@trueblade.com> wrote:
On 07/20/2015 10:08 AM, Ryan Gonzalez wrote:
You need to use format_map (or **dict(...)). And ChainMap might be a better choice, it would take some benchmarking to know. Also, you don't get builtins using this approach. I'm using eval to exactly match what evaluating the variable in the parent context would give you. That might not matter depending on the actual requirements. But I agree there are multiple ways to do this, and several of them could be made to work. Mine might have fatal flaws that more testing would show. Eric.
On 07/20/2015 10:19 AM, Eric V. Smith wrote:
My quick testing comes up with this, largely based on the code by joejev:

    import sys
    import collections

    def f(str):
        frame = sys._getframe(1)
        return str.format_map(collections.ChainMap(
            frame.f_locals,
            frame.f_globals,
            frame.f_globals['__builtins__'].__dict__))

I'm not sure about the builtins, but this seems to work. Also, you might want to be able to pass in the frame depth to allow this to be callable more than 1 level deep.

So, given that this is all basically possible to implement today (at the cost of using sys._getframe()), I'm -1 on adding any compiler tricks to support this via syntax. From what I know of PyPy, this should be supported there, albeit at a large performance cost.

Eric.
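A variant of that helper with the frame depth exposed might look like the sketch below. It is an illustration rather than a tested recipe: the `depth` parameter and the `demo` function are assumptions added for the example, and the `builtins` module is swapped in for `frame.f_globals['__builtins__']`, because `__builtins__` can be either a module or a plain dict depending on where the code runs.

```python
import builtins
import collections
import sys


def f(template, depth=1):
    # Relies on sys._getframe(), which is a CPython implementation detail.
    frame = sys._getframe(depth)
    namespace = collections.ChainMap(
        frame.f_locals,      # caller's locals first
        frame.f_globals,     # then the caller's module globals
        vars(builtins),      # finally the builtin names
    )
    return template.format_map(namespace)


def demo():
    foo = 42
    return f("{foo} {len.__name__}")


print(demo())  # 42 len
```

As with the original, names in enclosing function scopes are still invisible unless the calling function references them itself.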
Perhaps surprisingly, I find myself leaning in favor of the f'...{var}...' form. It is explicit in the variable name. Historically, the `x` notation as an alias for repr(x) was meant to play this role -- you'd write '...' + `var` + '...', but it wasn't brief enough, and the `` are hard to see. f'...' is more explicit, and can be combined with r'...' and b'...' (or both) as needed. -- --Guido van Rossum (python.org/~guido)
On 07/20/2015 01:25 PM, Guido van Rossum wrote:
We didn't implement b''.format(), for a variety of reasons. Mostly to do with user-defined types returning unicode from __format__, if I recall correctly.

So the idea is that f'x:{a.x} y:{y}' would translate to bytecode that does:

    'x:{a.x} y:{y}'.format(a=a, y=y)

Correct? I think I could leverage _string.formatter_parser() to do this, although it's been a while since I wrote that. And I'm not sure what's available at compile time. But I can look into it.

I guess the other option is to have it generate:

    'x:{a.x} y:{y}'.format_map(collections.ChainMap(globals(), locals(), __builtins__))

That way, I wouldn't have to parse the string to pick out what variables are referenced in it, then have .format() parse it again.

Eric.
Eric V. Smith wrote:
That's exactly what I had in mind, at least. Indexing is supported in format strings too, so f'{a[1]}' also becomes '{a[1]}'.format(a=a), but I don't think there are any other strange cases here. I would vote for f'{}' or f'{0}' to just be a SyntaxError. I briefly looked into how this would be implemented and while it's not quite trivial/localized, it should be relatively straightforward if we don't allow implicit merging of f'' strings. If we wanted to allow implicit merging then we'd need to touch more code, but I don't see any benefit from allowing it at all, let alone enough to justify seriously messing with this part of the parser.
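The name-collection step of such a translation can be sketched with the public `string.Formatter().parse()` method, which exposes the same parser that `_string.formatter_parser()` provides. The function name `referenced_names` and the choice to raise `SyntaxError` from ordinary code are assumptions made for illustration; a real implementation would live in the compiler.

```python
import re
import string
import types

_LEADING_NAME = re.compile(r"[^\W\d]\w*")  # an identifier at the start of a field


def referenced_names(template):
    """Return the top-level names a template would need as keyword arguments."""
    names = []
    for _literal, field, _spec, _conversion in string.Formatter().parse(template):
        if field is None:            # trailing literal text, no replacement field
            continue
        match = _LEADING_NAME.match(field)
        if match is None:            # '{}' or '{0}': there is no name to look up
            raise SyntaxError("unnamed field in %r" % template)
        if match.group() not in names:
            names.append(match.group())
    return names


a = types.SimpleNamespace(x=1)
y = 2
names = referenced_names("x:{a.x} y:{y}")
print(names)  # ['a', 'y']
# The generated call would then be: 'x:{a.x} y:{y}'.format(a=a, y=y)
print("x:{a.x} y:{y}".format(a=a, y=y))  # x:1 y:2
```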
If you really want to go with the second approach, ChainMap isn't going to be sufficient, for example:
If the change also came with a dict-like object that will properly resolve variables from the current scope, that would be fine, but I don't think it can be constructed in terms of existing names. (Also bear in mind that other Python implementations do not necessarily provide sys._getframe(), so defining the lookup in terms of that would not be helpful either.) Cheers, Steve
Eric.
(Our posts crossed, to some extent.) On Mon, Jul 20, 2015 at 8:41 PM, Steve Dower <Steve.Dower@microsoft.com> wrote:
[...]
+1 on that last sentence. But I prefer a slightly different way of implementing (see my reply to Eric).
Not sure what you mean by "implicit merging" -- if you mean literal concatenation (e.g. 'foo' "bar" == 'foobar') then I think it should be allowed, just like we support mixing quotes and r''. -- --Guido van Rossum (python.org/~guido)
On 07/20/2015 03:22 PM, Guido van Rossum wrote:
Right. And following up here to that email:
That is better. The trick is converting the string "a.x" to the expression a.x, which should be easy enough at compile time.
It would still probably be best to limit the syntax inside {} to exactly what regular .format() supports, to avoid confusing users.
The expressions supported by .format() are limited to attribute access and "indexing". We just need to enforce that same restriction here.
It would be easiest to not restrict the expressions, but then we'd have to maintain that restriction in two places. And now that I think about it, it's somewhat more complex than just expanding the expression. In .format(), this: '{a[0]}{b[c]}' is evaluated roughly as format(a[0]) + format(b['c']) So to be consistent with .format(), we have to fully parse at least the indexing out to see if it looks like a constant integer or a string. So given that, I think we should just support what .format() allows, since it's really not quite as simple as "evaluate the expression inside the braces".
If I understand it, I think the concern is: f'{a}{b}' 'foo{}' f'{c}{d}' would need to become: f'{a}{b}foo{{}}{c}{d}' So you have to escape the braces in non-f-strings when merging strings and any of them are f-strings, and make the result an f-string. But I think that's the only complication.
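The escaping step can be sketched as follows. The function `merge_adjacent` and its input shape, a list of `(is_fstring, text)` pairs in source order, are hypothetical and only illustrate the rule that braces in the non-f pieces get doubled before merging.

```python
def merge_adjacent(parts):
    """Merge adjacent literals into the body of a single f-string."""
    merged = []
    for is_fstring, text in parts:
        if not is_fstring:
            # Braces in an ordinary literal are plain text, so double them
            # to keep them literal once the result is treated as a template.
            text = text.replace("{", "{{").replace("}", "}}")
        merged.append(text)
    return "".join(merged)


body = merge_adjacent([(True, "{a}{b}"), (False, "foo{}"), (True, "{c}{d}")])
print(body)                               # {a}{b}foo{{}}{c}{d}
print(body.format(a=1, b=2, c=3, d=4))    # 12foo{}34
```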
On 07/20/2015 03:52 PM, Eric V. Smith wrote:
And thinking about it yet some more, I think the easiest and most consistent thing to do would be to translate it like:

    f'{a[0]}{b[c]}' == '{[0]}{[c]}'.format(a, b)

So:

    f'api:{sys.api_version} {a} size{sys.maxsize}'

would become either:

    'api:{.api_version} {} size{.maxsize}'.format(sys, a, sys)

or

    'api:{0.api_version} {1} size{0.maxsize}'.format(sys, a)

The first one seems simpler. The second probably isn't worth the micro-optimization, and it may even be a pessimization.

Eric.
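A rough sketch of the first translation, done at run time with `string.Formatter().parse()` purely for illustration. The name `to_positional` is hypothetical, and nested replacement fields inside a format spec are not handled.

```python
import re
import string
import sys

_LEADING_NAME = re.compile(r"[^\W\d]\w*")


def to_positional(template):
    """Return (rewritten_template, names) for a name-based template."""
    pieces, names = [], []
    for literal, field, spec, conversion in string.Formatter().parse(template):
        # parse() un-doubles '{{' and '}}', so re-escape the literal text.
        pieces.append(literal.replace("{", "{{").replace("}", "}}"))
        if field is None:
            continue
        match = _LEADING_NAME.match(field)
        if match is None:
            raise SyntaxError("every field must start with a name: %r" % field)
        names.append(match.group())
        # Keep the attribute/index suffix; str.format() will interpret it.
        replacement = "{" + field[match.end():]
        if conversion:
            replacement += "!" + conversion
        if spec:
            replacement += ":" + spec
        pieces.append(replacement + "}")
    return "".join(pieces), names


a = 7
rewritten, names = to_positional("api:{sys.api_version} {a} size{sys.maxsize}")
print(rewritten)  # api:{.api_version} {} size{.maxsize}
print(names)      # ['sys', 'a', 'sys']

scope = {"sys": sys, "a": a}
print(rewritten.format(*(scope[name] for name in names)))
```

Leaving the suffix untouched means the integer-versus-string decision for `[0]` and `[c]` stays inside `str.format()`.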
On Mon, Jul 20, 2015 at 11:41 AM, Steve Dower <Steve.Dower@microsoft.com> wrote:
Maybe I'm missing something, but it seems this could just as reasonably be '{}'.format(a[1])? Is there a reason to prefer the other form over this? On Mon, Jul 20, 2015 at 1:20 PM, Eric V. Smith <eric@trueblade.com> wrote:
Or: f'api:{} {} size{}'.format(sys.api_version, a, sys.maxsize) Note that format strings don't allow variables in subscripts, so f'{a[n]}' ==> '{}'.format(a['n']) Also, the discussion has assumed that if this feature were added it necessarily must be a single character prefix. Looking at the grammar, I don't see that as a requirement as it explicitly defines multiple character sequences. A syntax like: format'a{b}c' formatted"""a{b} c""" might be more readable. There's no namespace conflict just as there is no conflict between raw string literals and a variable named r. --- Bruce Check out my new puzzle book: http://J.mp/ingToConclusions Get it free here: http://J.mp/ingToConclusionsFree (available on iOS)
On 7/20/2015 5:29 PM, Bruce Leban wrote:
Right. But why re-implement that, instead of making it: '{[n]}'.format(a)?

I've convinced myself (and maybe no one else) that since you want this:

    a=[1,2]
    b={'c':42}
    f'{a[0]} {b[c]}'

being the same as:

    '{} {}'.format(a[0], b['c'])

that it would be easier to make it:

    '{[0]} {[c]}'.format(a, b)

instead of trying to figure out that the numeric-looking '0' gets converted to an integer, and the non-numeric-looking 'c' gets left as a string. That logic already exists in str.format(), so let's just leverage it from there.

It also means that you automatically will support the subset of expressions that str.format() already supports, with all of its limitations and quirks. But I now think that's a feature, since str.format() doesn't really support the same expressions as normal Python does (due to the [0] vs. ['c'] issue). And it's way easier to explain if f-strings support the identical syntax as str.format(). The only restriction is that all parameters must be named, and not numbered or auto-numbered.

Eric.
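The index rule being leaned on here can be checked with str.format() as it exists today. A short sketch; the `lookup` dict and the variable `n` are made up for the illustration:

```python
a = [1, 2]
b = {'c': 42}

# str.format() treats an all-digit index as an integer and anything else
# as a string key.
by_field = '{[0]} {[c]}'.format(a, b)
by_value = '{} {}'.format(a[0], b['c'])

# A name inside the brackets is never looked up as a variable:
# 'n' is used as a literal string key, whatever n is bound to.
n = 0
lookup = {'n': 'string key', 0: 'integer key'}
literal_key = '{[n]}'.format(lookup)
```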
Eric V. Smith wrote:
Right. But why re-implement that, instead of making it: '{[n]}'.format(a)?
Consider also the case of custom formatters. I've got one that overloads format_field, adds a units specifier in the format, which then uses our model units conversion and writes values in the current user-units of the system:

    x = body.x_coord  # A "Double()" object with units of length.
    print(f'{x:length:.3f}')  # Uses the "length" string to perform a units conversion much as "!r" would invoke "repr()".

I think your proposal above handles my use case the most cleanly.

Another Eric
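A formatter of the kind described could look roughly like the sketch below. The poster's actual code is not shown in the thread, so everything here is an assumption: the `UnitsFormatter` name, the `USER_UNIT_FACTORS` table, and the metres-to-centimetres factor are invented for the example. Only `string.Formatter.format_field` is a real standard-library hook.

```python
import string

# Hypothetical conversion table: factor from model units (metres) to the
# user's display units, keyed by the kind of quantity.
USER_UNIT_FACTORS = {'length': 100.0}  # metres -> centimetres


class UnitsFormatter(string.Formatter):
    """Treat a leading 'kind:' in the format spec as a units conversion."""

    def format_field(self, value, format_spec):
        kind, sep, rest = format_spec.partition(':')
        if sep and kind in USER_UNIT_FACTORS:
            # Convert the value, then format it with the remaining spec.
            value = value * USER_UNIT_FACTORS[kind]
            format_spec = rest
        return super().format_field(value, format_spec)


x = 1.23456  # stands in for body.x_coord
text = UnitsFormatter().format('{x:length:.3f}', x=x)
plain = UnitsFormatter().format('{:.1f}', 2.0)  # specs without a units prefix pass through
```

Because the conversion is driven entirely by the format spec, it keeps working under any transformation that hands the spec through to the formatting machinery unchanged.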
Eric V. Smith writes:
Yes, please! Guido's point that he wants no explicit use of locals(), etc, in the implementation took me a bit of thought to understand, but then I realized that it means a "macro" transformation with the resulting expression evaluated in the same environment as an explicit .format() would be. And that indeed makes the whole thing as explicit as invoking str.format would be. I don't *really* care what transformations are used to get that result, but DRYing this out and letting the __format__ method of the indexed object figure out the meaning of the format string makes me feel better about my ability to *think* about the meaning of an f"..." string. In particular, isn't it possible that a user class's __format__ might decide that *all* keys are strings? I don't see how the transformation Steve Dower proposed can possibly deal with that ambiguity. Another conundrum is that it's not obvious whether f"{a[01]}" is a SyntaxError (as it is with str.format) or equivalent to "{}".format(a['01']) (as my hypothetical user's class would expect).
On 7/21/2015 1:16 AM, Stephen J. Turnbull wrote:
Right. That is indeed the beauty of the thing. I now think locals(), etc. is a non-starter.
In today's world: '{a[0]:4d}'.format(a=a) the object whose __format__() method is being called is a[0], not a. So it's not up to the object to decide what the keys mean. That decision is being made by the ''.format() implementation. And that's also the way I'm envisioning it with f-strings.
It would still be a syntax error, in my imagined implementation, because it's really calling ''.format() to do the expansion.

So here's what I'm thinking f'some-string' would expand to. As you note above, it's happening in the caller's context:

    new_fmt = remove_all_object_names_from_string(s)
    objs = find_all_objects_referenced_in_string(s)
    result = new_fmt.format(*objs)

So given:

    X = namedtuple('X', 'name width')
    a = X('Eric', 10)
    value = 'some value'

then:

    f'{a.name:*^{a.width}}:{value}'

would become this transformed code:

    '{.name:*^{.width}}:{}'.format(*[a, a, value])

which would evaluate to:

    '***Eric***:some value'

The transformation of the f-string to new_fmt and the computation of objs is the only new part. The transformed code above works today.

Eric.
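The transformed code from the message above can be run as written once the import is added. In this sketch `new_fmt` and `objs` are filled in by hand, since the two helper functions named in the message are only imagined there:

```python
from collections import namedtuple

X = namedtuple('X', 'name width')
a = X('Eric', 10)
value = 'some value'

# Hand-written result of the proposed transformation of
#   f'{a.name:*^{a.width}}:{value}'
new_fmt = '{.name:*^{.width}}:{}'
objs = [a, a, value]  # one object per replacement field, nested fields included
result = new_fmt.format(*objs)
```

The nested `{.width}` field consumes its own positional argument, which is why `a` appears twice in `objs`.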
On Mon, Jul 20, 2015 at 4:20 PM, Eric V. Smith <eric@trueblade.com> wrote:
I think Python can do more at compile time and translate f"Result1={expr1:fmt1};Result2={expr2:fmt2}" to bytecode equivalent of "Result1=%s;Result2=%s" % ((expr1).__format__(fmt1), (expr2).__format__(fmt2))
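Written out by hand, the suggested translation looks like the sketch below. The sample values and format specs are arbitrary, and the built-in format(value, spec) is used here as the ordinary way to invoke a value's __format__ method:

```python
expr1, fmt1 = 3.14159, '.2f'
expr2, fmt2 = 255, '#x'

# What the compiler could emit: format each value on its own, then splice
# the pieces into the literal text with %s.
via_percent = 'Result1=%s;Result2=%s' % (format(expr1, fmt1), format(expr2, fmt2))

# The str.format() spelling of the same string, for comparison.
via_format = 'Result1={:.2f};Result2={:#x}'.format(expr1, expr2)
```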
I'd rather keep the transform as simple as possible. If text formatting is your bottleneck, congratulations on fixing your network, disk, RAM and probably your users. Those who need to micro-optimize this code can do what you suggested by hand - there's no need for us to make our lives more complicated for the straw man who has a string formatting bottleneck and doesn't know enough to research another approach. Cheers, Steve
On Mon, Jul 20, 2015 at 10:10 PM, Steve Dower <Steve.Dower@microsoft.com> wrote:
If text formatting is your bottleneck, congratulations on fixing your network, disk, RAM and probably your users.
Thank you, but one of my servers just spent 18 hours loading 10GB of XML data into a database. Given that CPU was loaded 100% all this time, I suspect neither network nor disk and not even RAM was the bottleneck. Since XML parsing was done by C code and only formatting of database INSERT instructions was done in Python, I strongly suspect string formatting had a sizable carbon footprint in this case. Not all string formatting is done for human consumption.
On Tue, Jul 21, 2015 at 12:44 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
Well-known rule of optimization: Measure, don't assume. There could be something completely different that's affecting your performance. I'd be impressed and extremely surprised if the formatting of INSERT queries took longer than the execution of those same queries, but even if that is the case, it could be the XML parsing (just because it's in C doesn't mean it's inherently faster than any Python code), or the database itself, or suboptimal paging of virtual memory. Before pointing fingers anywhere, measure. Measure. Measure! ChrisA
On Mon, Jul 20, 2015 at 10:53 PM, Chris Angelico <rosuav@gmail.com> wrote:
This is getting off-topic for this list, but you may indeed be surprised by the performance that kdb+ (kx.com) with PyQ (pyq.enlnt.com) can deliver. [Full disclosure: I am the author of PyQ, so sorry for a shameless plug.]
Sounds like you deserve the congratulations then :) But when you've confirmed that string formatting is something that can be changed to improve performance (specifically parsing the format string in this case), you have options regardless of the default optimization. For instance, you probably want to preallocate a list, format and set each non-string item, then use .join (or if possible, write directly from the list without the intermediate step of producing a single string). Making f"" strings subtly faster isn't going to solve your performance issue, and while I'm not advocating wastefulness, this looks like a premature optimization, especially when put alongside the guaranteed heap allocations and very likely IO that are also going to occur. Cheers, Steve
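The preallocate-and-join approach suggested above could be sketched as follows. `join_fields` is a hypothetical helper written for this illustration, and whether it beats a format string in a given workload would still need measuring:

```python
def join_fields(row, sep=','):
    """Build one output line without parsing any format string."""
    parts = [None] * len(row)  # preallocate one slot per item
    for i, item in enumerate(row):
        # Strings are stored as-is; everything else is formatted once.
        parts[i] = item if isinstance(item, str) else format(item)
    return sep.join(parts)


line = join_fields(['a', 1, 2.5])
```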
On Mon, Jul 20, 2015 at 11:16 PM, Steve Dower <Steve.Dower@microsoft.com> wrote:
One thing I know for a fact is that the use of % formatting instead of .format makes a significant difference in my applications. This is not surprising given these timings:

    $ python3 -mtimeit "'%d' % 2"
    100000000 loops, best of 3: 0.00966 usec per loop
    $ python3 -mtimeit "'{}'.format(2)"
    1000000 loops, best of 3: 0.216 usec per loop

As a result, my rule of thumb is to avoid the use of .format in anything remotely performance critical. If f"" syntax is implemented as a sugar for .format - it will be equally useless for most of my needs. However, I think it can be implemented in a way that will make me consider switching away from % formatting.
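A caveat when reproducing these numbers: with two literal operands, CPython may evaluate '%d' % 2 at compile time, in which case the first timing mostly measures loading a constant. A sketch of a comparison that avoids this by formatting a variable; the loop count is arbitrary, and no particular result is claimed here:

```python
import timeit

n = 2  # a variable operand, so neither expression can be folded into a constant
loops = 100000

percent_time = timeit.timeit("'%d' % n", globals={'n': n}, number=loops)
format_time = timeit.timeit("'{}'.format(n)", globals={'n': n}, number=loops)

# Report in the same unit that `python3 -m timeit` uses.
print('%% formatting: %.3f usec per loop' % (percent_time / loops * 1e6))
print('str.format:   %.3f usec per loop' % (format_time / loops * 1e6))
```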
participants (35)
- Akira Li
- Alexander Belopolsky
- Andrew Barnert
- Barry Warsaw
- Ben Finney
- Bruce Leban
- C Anthony Risinger
- Chris Angelico
- David Mertz
- Eric Fahlgren
- Eric Snow
- Eric V. Smith
- Georg Brandl
- Greg Ewing
- Guido van Rossum
- Joseph Jevnik
- Mark Lawrence
- Mike Miller
- MRAB
- Nick Coghlan
- Oleg Broytman
- Oscar Benjamin
- Paul Sokolovsky
- Ron Adam
- Ryan Gonzalez
- Sam O'Malley
- Stefan Behnel
- Stephen J. Turnbull
- Steve Dower
- Steven D'Aprano
- Sven R. Kunze
- Terry Reedy
- Tim Peters
- Wes Turner
- Xavier Combelle