
On Thu, Jul 23, 2015 at 7:22 AM, Steven D'Aprano <steve@pearwood.info> wrote:
If I had a dollar for every time somebody on the Internet misused "strawman argument", I would be a rich man.
You wouldn't get a dollar here. If you want to be strict, a strawman argument is misrepresenting an opponent's viewpoint to make it easier to refute, but the term also applies to similar arguments. You stated that "constant folding ... *would* change the semantics" *[emphasis added]*. It's not a fact that constant folding must change the semantics, as is easily shown. And in fact, by definition, constant folding should never change semantics. So the straw here is imagining that the implementer of this feature would ignore the accepted rules regarding constant folding and then criticizing the implementer for doing that.
(1) Explicit concatenation with the + operator should be treated as occurring after the f strings are evaluated, *as if* the following occurs:
f'{spam}' + '{eggs}' => compiles to format(spam) + '{eggs}'
<snip>
Do you agree with those semantics for explicit + concatenation? If not, what behaviour do you want?
I agree with that.
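To make the agreed semantics concrete, here is a minimal sketch of the desugaring quoted above. It uses only the built-in format(); the values of spam and eggs are made up for illustration.

```python
# Hypothetical values, chosen only for illustration.
spam = 42
eggs = "never used"

# Proposed desugaring of:  f'{spam}' + '{eggs}'
# Only the f-string operand is formatted; the right operand is an
# ordinary string, so its braces are left alone.
result = format(spam) + '{eggs}'
print(result)
```

The second operand keeps its literal braces because the + is applied to two already-evaluated strings.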
(2) Implicit concatenation should occur as early as possible, before the format. Take the easy case first: both fragments are f-strings.
f'{spam}' f'{eggs}' => behaves as if you wrote f'{spam}{eggs}' => which compiles to format(spam) + format(eggs)
Do you agree with those semantics for implicit concatenation?
Yes
(3) The hard case, when you mix f and non-f strings.
f'{spam}' '{eggs}'
Notwithstanding raw strings, the behaviour which makes sense to me is that the implicit string concatenation occurs first, followed by format.
You talk about which happens "first" so let's recast this as an operator precedence question. Think of f as a unary operator. Does f bind tighter than implicit concatenation? Well, all other string operators like this bind more tightly than concatenation. f'{spam}' '{eggs}'
Secondly, it feels that this does the concatenation in the wrong order. Implicit concatenation occurs as early as possible in every other case. But here, we're delaying the concatenation until after the format. So this feels wrong to me.
Implicit concatenation does NOT happen as early as possible in every case. When I write: r'a\n' 'b\n' ==> 'a\\nb\n' the r is applied to the first string *before* the concatenation with the second string.
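The raw-string case can be checked directly in any current Python. The sketch below only restates the example above as assertions; the variable names are mine.

```python
# r applies to the first literal only; the second literal is cooked.
mixed = r'a\n' 'b\n'
assert mixed == 'a\\nb\n'
assert len(mixed) == 5          # a, backslash, n, b, newline

# If concatenation happened first and r were applied to the result,
# both backslashes would survive and the string would be different.
whole_raw = r'a\nb\n'
assert len(whole_raw) == 6      # a, backslash, n, b, backslash, n
assert whole_raw != mixed
```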
If there's no consensus on the behaviour of mixed f and non-f strings with implicit concatenation, rather than pick one and frustrate and surprise half the users, we should make it an error:
f'{spam}' '{eggs}' => raises SyntaxError
We can't just say that when the concatenation actually occurs is an optimization, as we can with raw and cooked string literals, because the f string is not a literal, it's actually a function call in disguise. So we have to pick one or the other (or refuse to guess and raise a syntax error).
Imagine that we have another prefix that escapes strings for regex. That is, e'a+b' ==> 'a\\+b'. This is another function call in disguise, just calling re.escape. Applying your reasoning could have us conclude that e is just like f and should infect all the other strings it is concatenated with. But that would actually break the reason to have this in the first place, which is writing strings like this:
'(' e'1+2' '|' e'1*2' '){1,2}'
Perhaps you're thinking that e should be done at compile time. Well, when I combine it with f, it clearly must be done at run-time:
'(' ef'{foo}' '|' ef'{bar}' '){1,2}'
I'm not actually proposing an e prefix. I'm just speculating how it would work if we had one. And combining e and f must mean do f then e because the other order is useless, just as combining f and r must mean do r then f.
Maybe you can't say that concatenation is an optimization but I can (new text marked with asterisks):
Multiple adjacent string or bytes literals (delimited by whitespace), possibly using different quoting conventions, are allowed, and their meaning is the same as their concatenation. ... Thus, "hello" 'world' is equivalent to "helloworld". This feature can be used to reduce the number of backslashes needed, to split long strings conveniently across long lines, *to mix formatted and unformatted strings,* or even to add comments to parts of strings, for example:
re.compile("[A-Za-z_]"       # letter or underscore
           "[A-Za-z0-9_]*"   # letter, digit or underscore
          )
Note that this feature is defined at the syntactical level, but implemented at compile time *as an optimization*. The ‘+’ operator must be used to concatenate string expressions at run time. Also note that literal concatenation can use different quoting styles for each component (even mixing raw strings and triple quoted strings).
*If formatted strings are mixed with unformatted strings, they are concatenated at compile time and the unformatted parts are escaped so they will not be subject to format substitutions.*
--- Bruce
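A rough sketch of the two ideas above, using explicit + in place of the prefixes. The helpers e() and literal() are hypothetical stand-ins of my own: e() plays the role of the speculative e prefix by calling re.escape, and literal() emulates "the unformatted parts are escaped" by doubling braces before str.format runs.

```python
import re

def e(text):
    """Hypothetical stand-in for an e'' prefix: escape regex metacharacters."""
    return re.escape(text)

# '(' e'1+2' '|' e'1*2' '){1,2}' with e applied to each fragment only.
pattern = '(' + e('1+2') + '|' + e('1*2') + '){1,2}'
assert re.fullmatch(pattern, '1+2')
assert re.fullmatch(pattern, '1+21*2')

# If e "infected" its neighbours, the grouping and repetition syntax
# would be escaped as well and the pattern would match only itself.
infected = e('(' '1+2' '|' '1*2' '){1,2}')
assert re.fullmatch(infected, '1+2') is None
assert re.fullmatch(infected, '(1+2|1*2){1,2}')

def literal(text):
    """Hypothetical helper: protect braces from str.format substitution."""
    return text.replace('{', '{{').replace('}', '}}')

# f'{spam}' '{eggs}' under the proposed wording: concatenate first,
# with the unformatted fragment escaped, then format once.
template = '{spam}' + literal('{eggs}')
assert template.format(spam=42) == '42{eggs}'
```

Concatenating early and formatting once gives the same result as formatting the f fragment alone, which is why the timing can be treated as an optimization.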