[Python-ideas] If branch merging

Mon Jun 8 13:24:33 CEST 2015

On Jun 8, 2015, at 03:32, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 8 June 2015 at 14:24, Andrew Barnert <abarnert at yahoo.com> wrote:
>> 
>> The problem with general named subexpressions is that it inherently means a
>> side effect buried in the middle of an expression. While it's not
>> _impossible_ to do that in Python today (e.g., you can always call a
>> mutating method in a comprehension's if clause or in the third argument to a
>> function), it's not common or idiomatic.
>> 
>> You could say this is a consenting-adults issue and you shouldn't use it in
>> cases where it's not deep inside an expression--but those are the actual
>> motivating cases, the ones where just "pull it out into a named assignment"
>> won't work. In fact, one of our three examples is:
>> 
>>   [b for a in iterable if (a.b as b)]
>> 
>> 
>> That's exactly the kind of place that you'd call non-idiomatic with a
>> mutating method call, so why is a binding not even worse?
> 
> Ah, but that's one of the interesting aspects of the idea: since
> comprehensions and generator expressions *already* define their own
> nested scope in Python 3 in order to keep the iteration variable from
> leaking, their named subexpressions wouldn't leak either :)
> 
> For if/elif clauses and while loops, the leaking would be a desired
> feature in order to make the subexpression available for use inside
> the following suite body.

Except it would also make the subexpression available for use _after_ the suite body. And it would give you a way to accidentally replace rather than shadow a variable from earlier in the function. So it really is just as bad as any other assignment or other mutation inside a condition.
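The proposed `as` binding is not valid Python, so it can't be run directly. As a rough analogy only, the sketch below uses constructs that exist today: a comprehension's variable stays in its own scope, while a `for` statement's target is bound in the enclosing function, stays visible after the suite, and replaces an earlier variable of the same name. The function name `leak_demo` is made up for illustration.

```python
def leak_demo(items):
    b = "earlier value"

    # A comprehension has its own nested scope in Python 3,
    # so its `b` does not touch the function-level `b`.
    truthy = [b for b in items if b]
    after_comprehension = b

    # A for statement binds its target in the enclosing function scope:
    # `b` is still bound after the suite, and the earlier value is gone.
    for b in items:
        pass
    after_loop = b

    return truthy, after_comprehension, after_loop


result = leak_demo([0, 1, 2])
print(result)
```

A statement-level named subexpression in an if/elif/while condition would presumably behave like the `for` target here rather than like the comprehension variable.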

> That would leave conditional expressions as the main suggested use
> case where leaking the named subexpressions might not be desirable.
> Without any dedicated syntax, the two ways that first come to mind for
> doing expression local named subexpressions would be:
> 
>    x = (lambda a=a: b if (a.b as b) else a.c)()
>    x = next((b if (a.b as b) else a.c) for a in (a,))
> 
> Neither of which would be a particularly attractive option.

Especially since if you're willing to introduce an otherwise-unnecessary scope, you don't even need this feature:

    x = (lambda b: b if b else a.c)(a.b)
    x = (lambda b=a.b: b if b else a.c)()

Or, of course, you can just define a reusable ifelse function somewhere:

    def defaultify(val, defaultval):
        return val if val else defaultval

    x = defaultify(a.b, a.c)
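One difference between the two spellings is worth keeping in mind: the lambda form only evaluates a.c when it is needed, while a plain function call evaluates both arguments up front. A minimal sketch, assuming a made-up `Record` class whose `c` property counts how often it is read:

```python
class Record:
    """Hypothetical stand-in for `a`; counts reads of the `c` attribute."""

    def __init__(self, b):
        self.b = b
        self.c_reads = 0

    @property
    def c(self):
        self.c_reads += 1
        return "fallback"


def defaultify(val, defaultval):
    return val if val else defaultval


a = Record(b="value")

# Lambda form: a.c sits in the else branch, so it is never read here.
x1 = (lambda b: b if b else a.c)(a.b)
reads_after_lambda = a.c_reads

# Function form: a.c is evaluated as an argument before the call happens.
x2 = defaultify(a.b, a.c)
reads_after_defaultify = a.c_reads
```

Both forms give the same result, but `defaultify` is only a drop-in replacement when evaluating the default is cheap and free of side effects.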

> The other possibility that comes to mind is to ask the question: "What
> happens when a named subexpression appears as part of an argument list
> to a function call, or as part of a subscript operation, or as part of
> a container display?", as in:
> 
>    x = func(b if (a.b as b) else a.c)
>    x = y[b if (a.b as b) else a.c]
>    x = (b if (a.b as b) else a.c),
>    x = [b if (a.b as b) else a.c]
>    x = {b if (a.b as b) else a.c}
>    x = {'k': b if (a.b as b) else a.c}
> 
> Having *those* subexpressions leak seems highly questionable, so it
> seems reasonable to suggest that in order for this idea to be workable
> in practice, there would need to be some form of implicit scoping rule
> where using a named subexpression turned certain constructs into
> "scoped subexpressions" that implicitly created a function object and
> called it, rather than being evaluated inline as normal.

Now you really _are_ reinventing let. A let expression like this:

    x = let b=a.b in (b if b else a.c)

... is effectively just syntactic sugar for the lambda above. And it's a lot more natural and easy to reason about than letting b escape one step out to the conditional expression but not any farther. (Or to the rest of the complete containing expression? Or the statement? What does "x[(a.b as b)] = b" mean, for example? Or "x[(b if (a.b as b) else a.c) + (b if (d.b as b) else d.c)]"? Or "x[(b if (a.b as b) else a.c) + b]"?)
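To make the "sugar for a lambda" reading concrete, here is a minimal sketch of let as an ordinary helper. The name `let` and its keyword-argument interface are assumptions for illustration, not proposed syntax, and `a` is a stand-in built with `types.SimpleNamespace`.

```python
from types import SimpleNamespace


def let(body, **bindings):
    """Hypothetical helper: call `body` with the given names bound as its
    parameters, so the bindings exist only inside `body`."""
    return body(**bindings)


a = SimpleNamespace(b=0, c="fallback")

# Rough equivalent of: x = let b=a.b in (b if b else a.c)
x = let(lambda b: b if b else a.c, b=a.b)

# `b` was only ever a parameter of the lambda; nothing leaked out.
b_leaked = "b" in globals()
```

The scope of `b` is exactly the body of the lambda, which is what makes this version easy to reason about.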

As a side note, the initial proposal here was to improve performance by not repeating the a.b lookup; I don't think adding an implicit comprehension-like function definition and call will be faster than a getattr except in very uncommon cases. However, I think there are reasonable cases where it's more about correctness than performance (e.g., the real expression you want to avoid evaluating twice is next(spam) or f.readline(), not a.b), so I'm not too concerned there. Also, I'm pretty sure a JIT could effectively inline a function definition plus call more easily than it could apply common-subexpression elimination (CSE) to an expression that's hard to prove is static.
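The correctness case is easy to show with today's Python. In the sketch below, `spam` is just a list iterator chosen for illustration: repeating next(spam) in a conditional expression consumes two items, while binding it once through a lambda consumes one.

```python
# Repeating the expression: the condition is evaluated first and consumes
# one item, then the result expression consumes the next one.
spam = iter([1, 2, 3])
naive = next(spam) if next(spam) else None
left_after_naive = list(spam)

# Binding once through a lambda argument: next(spam) runs a single time.
spam = iter([1, 2, 3])
bound = (lambda v: v if v else None)(next(spam))
left_after_bound = list(spam)

print(naive, left_after_naive)
print(bound, left_after_bound)
```

Here the two versions return different values, so the choice is not about speed at all.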