[Python-ideas] Briefer string format

Bruce Leban bruce at leban.us
Fri Jul 24 03:57:25 CEST 2015


On Thu, Jul 23, 2015 at 7:22 AM, Steven D'Aprano <steve at pearwood.info>
wrote:

>
> If I had a dollar for every time somebody on the Internet misused
> "strawman argument", I would be a rich man.


You wouldn't get a dollar here. If you want to be strict, a strawman
argument is misrepresenting an opponent's viewpoint to make it easier to
refute, but the term also applies to similar arguments. You stated that "constant
folding ... *would* change the semantics" *[emphasis added]*.

It's not a fact that constant folding must change the semantics, as is
easily shown. And in fact, by definition, constant folding should never
change semantics. So the straw here is imagining that the implementer of
this feature would ignore the accepted rules regarding constant folding and
then criticizing the implementer for doing that.



> (1) Explicit concatenation with the + operator should be treated as
> occurring after the f strings are evaluated, *as if* the following
> occurs:
>
>     f'{spam}' + '{eggs}'
>     => compiles to format(spam) + '{eggs}'
>
> <snip>
>
> Do you agree with those semantics for explicit + concatenation? If not,
> what behaviour do you want?


I agree with that.

>
>

> (2) Implicit concatenation should occur as early as possible, before
> the format. Take the easy case first: both fragments are f-strings.
>
>     f'{spam}' f'{eggs}'
>     => behaves as if you wrote f'{spam}{eggs}'
>     => which compiles to format(spam) + format(eggs)
>
> Do you agree with those semantics for implicit concatenation?
>

Yes

> (3) The hard case, when you mix f and non-f strings.
>
>     f'{spam}' '{eggs}'
>
> Notwithstanding raw strings, the behaviour which makes sense to me is
> that the implicit string concatenation occurs first, followed by format.
>

You talk about which happens "first", so let's recast this as an operator
precedence question. Think of f as a unary operator. Does f bind tighter
than implicit concatenation? Well, all other string operators like this
bind more tightly than concatenation.

    f'{spam}' '{eggs}'
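
To make the precedence question concrete, here is a minimal sketch of the two readings. The helper f() is hypothetical: it stands in for the proposed prefix by calling str.format with explicitly passed names, whereas the real proposal would look names up in the enclosing scope.

```python
# Hypothetical stand-in for the proposed f prefix (an assumption for
# illustration only): format a template against explicitly supplied names.
def f(template, **names):
    return template.format(**names)

spam, eggs = 'SPAM', 'EGGS'

# Reading 1: f binds tighter than implicit concatenation, so only the
# first fragment is formatted and the second is left alone.
tight = f('{spam}', spam=spam) + '{eggs}'

# Reading 2: implicit concatenation happens first, then f applies to
# the combined string, so both fields are substituted.
loose = f('{spam}' '{eggs}', spam=spam, eggs=eggs)

print(tight)  # SPAM{eggs}
print(loose)  # SPAMEGGS
```

Under the first reading the result matches the explicit + case, format(spam) + '{eggs}'.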


> Secondly, it feels that this does the concatenation in the wrong order.
> Implicit concatenation occurs as early as possible in every other case.
> But here, we're delaying the concatenation until after the format. So
> this feels wrong to me.
>

Implicit concatenation does NOT happen as early as possible in every case.
When I write:

    r'a\n' 'b\n'  ==>  'a\\nb\n'

the r is applied to the first string *before* the concatenation with the
second string.
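
This can be checked directly in any current Python. A short sketch (the variable names are mine, chosen only for illustration):

```python
# The r prefix applies to the first fragment only; the second fragment
# is cooked as usual, so its \n is a real newline.
raw_then_cooked = r'a\n' 'b\n'
assert raw_then_cooked == 'a\\nb\n'
assert len(raw_then_cooked) == 5   # a, backslash, n, b, newline

# If r were applied after concatenation, both backslashes would survive,
# which is what you get by marking both fragments raw.
both_raw = r'a\n' r'b\n'
assert both_raw == 'a\\nb\\n'
assert len(both_raw) == 6

assert raw_then_cooked != both_raw
```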


> If there's no consensus on the behaviour of mixed f and non-f strings
> with implicit concatenation, rather than pick one and frustrate and
> surprise half the users, we should make it an error:
>
>     f'{spam}' '{eggs}'
>     => raises SyntaxError




> We can't just say that when the concatenation actually occurs is an
> optimization, as we can with raw and cooked string literals, because the
> f string is not a literal, it's actually a function call in disguise. So
> we have to pick one or the other (or refuse to guess and raise a syntax
> error).
>

Imagine that we have another prefix that escapes strings for regex. That is,
e'a+b' ==> 'a\\+b'. This is another function call in disguise, just calling
re.escape. Applying your reasoning could have us conclude that e is just
like f and should infect all the other strings it is concatenated with. But
that would actually break the reason to have this in the first place,
writing strings like this:

    '(' e'1+2' '|' e'1*2' '){1,2}'

Perhaps you're thinking that e should be done at compile time. Well, when I
combine it with f, it clearly must be done at run-time:

    '(' ef'{foo}' '|' ef'{bar}' '){1,2}'

I'm not actually proposing an e prefix. I'm just speculating how it would
work if we had one. And combining e and f must mean do f then e because the
other order is useless, just as combining f and r must mean do r then f.
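
Here is a rough sketch of that speculation, with the imagined prefixes spelled out as ordinary calls. The helper e() is hypothetical and simply wraps re.escape; format(foo) stands in for f'{foo}'.

```python
import re

# Hypothetical stand-in for the imagined e prefix.
def e(text):
    return re.escape(text)

# e applies per fragment: the parentheses, | and {1,2} keep their
# regex meaning while the operands are escaped.
pattern = '(' + e('1+2') + '|' + e('1*2') + '){1,2}'

# If e "infected" every adjacent fragment, the whole string would be
# escaped and would only match itself literally.
infected = e('(' '1+2' '|' '1*2' '){1,2}')

per_fragment_matches = re.fullmatch(pattern, '1+21*2') is not None
infected_matches = re.fullmatch(infected, '1+21*2') is not None

# Combining e and f: format first, then escape, necessarily at run time.
foo, bar = 'a.b', 'c?'
runtime_pattern = '(' + e(format(foo)) + '|' + e(format(bar)) + '){1,2}'
runtime_matches = re.fullmatch(runtime_pattern, 'a.bc?') is not None
```

With per-fragment escaping the pattern matches '1+21*2'; with the infecting reading it does not.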

Maybe you can't say that concatenation is an optimization, but I can (new
text marked with asterisks):

Multiple adjacent string or bytes literals (delimited by whitespace),
possibly using different quoting conventions, are allowed, and their
meaning is the same as their concatenation. ... Thus, "hello" 'world' is
equivalent to "helloworld". This feature can be used to reduce the number
of backslashes needed, to split long strings conveniently across long
lines, *to mix formatted and unformatted strings,* or even to add comments
to parts of strings, for example:

    re.compile("[A-Za-z_]"       # letter or underscore
               "[A-Za-z0-9_]*"   # letter, digit or underscore
              )

Note that this feature is defined at the syntactical level, but implemented
at compile time *as an optimization*. The ‘+’ operator must be used to
concatenate string expressions at run time. Also note that literal
concatenation can use different quoting styles for each component (even
mixing raw strings and triple quoted strings). *If formatted strings are
mixed with unformatted strings, they are concatenated at compile time and
the unformatted parts are escaped so they will not be subject to format
substitutions.*
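
As a rough illustration of that last sentence, here is one way the compile-time step could work. This is only a sketch under my own assumptions: join_fragments() is a hypothetical helper, fragments are given as (is_formatted, text) pairs, and str.format stands in for the eventual f-string evaluation.

```python
def join_fragments(fragments):
    """Concatenate (is_formatted, text) pairs into one format template.

    Braces in unformatted fragments are doubled so that a later
    str.format call leaves them as literal text.
    """
    parts = []
    for is_formatted, text in fragments:
        if is_formatted:
            parts.append(text)
        else:
            parts.append(text.replace('{', '{{').replace('}', '}}'))
    return ''.join(parts)

# Models f'{spam}' '{eggs}': one formatted and one plain fragment.
template = join_fragments([(True, '{spam}'), (False, '{eggs}')])

# A single format call at run time over the pre-joined template.
result = template.format(spam='SPAM')

print(template)  # {spam}{{eggs}}
print(result)    # SPAM{eggs}
```

The concatenation is done once, ahead of time, yet the result is the same as format(spam) + '{eggs}', so when the join happens is unobservable.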

--- Bruce