[Python-ideas] Briefer string format
Guido van Rossum
guido at python.org
Tue Jul 21 15:03:43 CEST 2015
Thanks, Eric! You're addressing all my concerns and you're going exactly
where I wanted this to go. I hope that you will find the time to write up a
PEP; take your time. Regarding your [1], let's not consider unevaluated
f-strings as a feature; that use case is sufficiently covered by the
existing str.format().
On Tue, Jul 21, 2015 at 1:58 PM, Eric V. Smith <eric at trueblade.com> wrote:
> On 7/21/2015 2:05 AM, Guido van Rossum wrote:
> > And now that I think about it, it's somewhat more complex than just
> > expanding the expression. In .format(), this:
> > '{a[0]}{b[c]}'
> > is evaluated roughly as
> > format(a[0]) + format(b['c'])
> >
> >
> > Oooh, this is very unfortunate. I cannot support this. Treating b[c] as
> > b['c'] in a "real" format string is one way, but treating it that way in
> > an expression is just too weird.
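
An illustrative sketch of the str.format() behaviour described above (not part of the original exchange; the values of `a`, `b`, and `c` are placeholder data chosen for the example). Inside a replacement field, a non-numeric index is taken as a string key, so a variable named `c` is never consulted:

```python
# Placeholder data, chosen only for this illustration.
a = ['x', 'y']
b = {'c': 'from string key', 0: 'from variable'}
c = 0  # a variable named c; str.format() never looks at it

# In a replacement field, a[0] indexes with the integer 0,
# but b[c] indexes with the string 'c', not the variable c.
result = '{a[0]}{b[c]}'.format(a=a, b=b)

# The rough equivalent given in the message above.
manual = format(a[0]) + format(b['c'])

# What the same subscript means as an ordinary Python expression.
as_expression = format(b[c])

print(result)
print(as_expression)
```

The two lookups disagree, which is the mismatch being objected to here.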
>
> I think you're right here, and my other emails were trying too much to
> simplify the implementation and keep the parallels with str.format().
> The difference between str.format() and f-strings is that in
> str.format() you can have an arbitrarily complex expression as the
> passed in argument to .format(). With f-strings, you'd be limited to
> just what can be extracted from the string itself: there are no
> arguments to be passed in. So maybe we do want to allow arbitrary
> expressions inside the f-string.
>
> For example:
>
> '{a.foo}'.format(a=b[c])
>
> If we limit f-strings to just what str.format() string expressions can
> represent, it would be impossible to represent this with an f-string,
> without an intermediate assignment.
>
> But if we allowed arbitrary expressions inside an f-string, then we'd have:
> f'{b[c].foo}'
>
> and similarly:
> '{a.foo}'.format(a=b['c'])
> would become:
> f'{b["c"].foo}'
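
A small sketch of the two rewrites above, assuming Python 3.6 or later, where f-strings accept arbitrary expressions. The dictionary contents and the `foo` attribute are hypothetical stand-ins:

```python
from types import SimpleNamespace

# Hypothetical data: one entry under the string key 'c',
# another under the value held by the variable c.
b = {
    'c': SimpleNamespace(foo='via string key'),
    7: SimpleNamespace(foo='via variable'),
}
c = 7

# The variable c is used as the key in both spellings.
old_var = '{a.foo}'.format(a=b[c])
new_var = f'{b[c].foo}'

# The literal string 'c' is used as the key in both spellings.
old_key = '{a.foo}'.format(a=b['c'])
new_key = f'{b["c"].foo}'

print(old_var, new_var)
print(old_key, new_key)
```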
>
> But now we'd be breaking compatibility with str.format(). Maybe it's
> worth it, though. I can see 80% of the uses of str.format() being
> replaced by f-strings. The remainder would be cases where format strings
> are passed in to other functions. I do this a lot with custom logging [1].
>
> The implementation complexity goes up by allowing arbitrary expressions.
> Not that that is necessarily a reason to drive a design decision.
>
> For example:
> f'{a[2:3]:20d}'
>
> We need to extract the expression "a[2:3]" and the format spec "20d". I
> can't just scan for a colon any more, I've got to actually parse the
> expression until I find a "}", ":", or "!" that's not part of the
> expression so that I know where it ends. But since it's happening at
> compile time, I surely have all of the tools at my disposal. I'll have
> to look through the grammar to see what the complexities here are and
> where this would fit in.
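
A rough sketch of the scanning problem described above, not the actual compiler code. The helper name `split_field` is hypothetical. It assumes the text between the braces has already been isolated, and it only tracks bracket depth and simple quoted strings (no backslash escapes, lambdas, or other expressions with a top-level colon):

```python
def split_field(field):
    """Split replacement-field text into (expression, conversion, format_spec).

    Simplified illustration: a ':' or '!' ends the expression only when it
    appears outside brackets and outside quoted strings.
    """
    depth = 0      # nesting level of (), [], {}
    quote = None   # the active quote character, if inside a string
    for i, ch in enumerate(field):
        if quote:
            if ch == quote:
                quote = None
        elif ch in '\'"':
            quote = ch
        elif ch in '([{':
            depth += 1
        elif ch in ')]}':
            depth -= 1
        elif depth == 0 and ch == '!' and field[i + 1:i + 2] != '=':
            # '!' starts a conversion unless it is part of '!='.
            conv, sep, spec = field[i + 1:].partition(':')
            return field[:i], conv, (spec if sep else None)
        elif depth == 0 and ch == ':':
            return field[:i], None, field[i + 1:]
    return field, None, None


print(split_field('a[2:3]:20d'))
print(split_field('x!r:>10'))
```

The colon inside `a[2:3]` is skipped because the bracket depth is 1 at that point; only the second colon is treated as the start of the format spec.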
>
> > So given that, I think we should just support what .format() allows,
> > since it's really not quite as simple as "evaluate the expression
> > inside the braces".
> >
> > Alas. And this is probably why we don't already have this feature.
>
> Agreed. So I think it's either "don't be compatible with str.format
> expressions" or "abandon the proposed f-strings".
>
> > > Not sure what you mean by "implicit merging" -- if you mean literal
> > > concatenation (e.g. 'foo' "bar" == 'foobar') then I think it should
> > > be allowed, just like we support mixing quotes and r''.
> >
> > If I understand it, I think the concern is:
> >
> > f'{a}{b}' 'foo{}' f'{c}{d}'
> >
> > would need to become:
> > f'{a}{b}foo{{}}{c}{d}'
> >
> > So you have to escape the braces in non-f-strings when merging strings
> > and any of them are f-strings, and make the result an f-string. But I
> > think that's the only complication.
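
A quick check of the merging rule above, assuming Python 3.6 or later (the values of `a` through `d` are placeholders). Braces in the plain literal stay literal, which matches doubling them when everything is folded into one f-string:

```python
# Placeholder values for the four names used in the example.
a, b, c, d = 1, 2, 3, 4

# Adjacent literals: two f-strings around one ordinary string.
implicit = f'{a}{b}' 'foo{}' f'{c}{d}'

# The single merged f-string, with the plain literal's braces escaped.
merged = f'{a}{b}foo{{}}{c}{d}'

print(implicit)
print(merged)
```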
> >
> >
> > That's possible; another possibility would be to just have multiple
> > .format() calls (one per f'...') and use the + operator to concatenate
> > the pieces.
>
> Right. I think the implementation would actually use _PyUnicodeWriter to
> build the string up, but it would logically be equivalent to:
>
> 'foo ' f'b:{b["c"].foo:20d} is {on_off}' ' bar'
>
> becoming:
>
> 'foo ' + 'b:' + format(b["c"].foo, '20d') + ' is ' +
> format(on_off) + ' bar'
>
> At this point, the implementation wouldn't call str.format() because
> it's not being used to evaluate the expression. It would just call
> format() directly. And since it's doing that without having to look up
> .format on the string, we'd get some performance back that str.format()
> currently suffers from.
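
A sketch checking the logical equivalence described above, assuming Python 3.6 or later. The contents of `b` and the value of `on_off` are hypothetical. It says nothing about how the string is actually built internally:

```python
from types import SimpleNamespace

# Hypothetical data matching the names used in the example.
b = {'c': SimpleNamespace(foo=42)}
on_off = 'on'

# Adjacent literals, one of them an f-string.
literal = 'foo ' f'b:{b["c"].foo:20d} is {on_off}' ' bar'

# The same result spelled out with format() calls and '+'.
expanded = ('foo ' + 'b:' + format(b["c"].foo, '20d') + ' is ' +
            format(on_off) + ' bar')

print(literal == expanded)
```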
>
> Nothing is really lost by not merging the adjacent strings, since the
> f-strings by definition are replaced by function calls. Maybe the
> optimizer could figure out that 'foo ' + 'b:' could be merged into
> 'foo b:'. Or maybe the user should refactor the strings if it's that important.
>
> I'm out of the office all day and won't be able to respond to any follow
> ups until later. But that's good, since I'll be forced to think before
> typing!
>
> Eric.
>
> [1] Which makes me think of the crazy idea of passing in unevaluated
> f-strings in to another function to be evaluated in their context. But
> the code injection opportunities with doing this with arbitrary
> user-specified strings are just too scary to think about. At least with
> str.format() you're limited in what the expressions can do: basically
> indexing and attribute access. No function calls: '{.exit()}'.format(sys) !
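
A sketch of the restriction mentioned in the footnote, using a hypothetical `Greeter` class instead of `sys` and explicit `0` field numbers. str.format() treats the text after the dot as an attribute name, so `shout()` is looked up literally and never called:

```python
class Greeter:
    """Hypothetical class used only for this illustration."""
    name = 'world'

    def shout(self):
        return 'HELLO'


g = Greeter()

# Attribute access and indexing are allowed in a replacement field.
attr_ok = '{0.name}'.format(g)
index_ok = '{0[1]}'.format(['a', 'b'])

# A call is not: the attribute named 'shout()' does not exist.
try:
    '{0.shout()}'.format(g)
except AttributeError:
    call_blocked = True
else:
    call_blocked = False

print(attr_ok, index_ok, call_blocked)
```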
>
>
--
--Guido van Rossum (python.org/~guido)