The `for y in [x]` idiom in comprehensions
Yet another discussion about reusing common subexpressions in comprehensions took place last week on the Python-ideas mailing list (see the topic "Temporary variables in comprehensions" [1]). The problem is that in a comprehension like `[f(x) + g(f(x)) for x in range(10)]` the subexpression `f(x)` is evaluated twice. In a normal loop you can introduce a temporary variable for `f(x)`. The OP wanted to add special syntax for introducing temporary variables in comprehensions. This idea has already been discussed multiple times in the past.

There are several ways of resolving this problem with existing syntax.

1. Inner generator expression:

    result = [y + g(y) for y in (f(x) for x in range(10))]

2. The same, but extracting the inner generator expression as a variable:

    f_samples = (f(x) for x in range(10))
    result = [y+g(y) for y in f_samples]

3. Extracting the expression with repeated subexpressions as a function with local variables:

    def func(x):
        y = f(x)
        return y + g(y)
    result = [func(x) for x in range(10)]

4. Implementing the whole comprehension as a generator function:

    def gen():
        for x in range(10):
            y = f(x)
            yield y + g(y)
    result = list(gen())

5. Using a normal loop instead of a comprehension:

    result = []
    for x in range(10):
        y = f(x)
        result.append(y + g(y))

And maybe there are other ways.

Stephan Houben proposed an idiom which looks similar to the new hypothetical syntax:

    result = [y + g(y) for x in range(10) for y in [f(x)]]

`for y in [expr]` in a comprehension means just assigning expr to y. I have never seen this idiom before, but it can be a good replacement for a hypothetical syntax for assignment in comprehensions. It changes the original comprehension less than the other approaches; it just adds one more element to the sequence of for-s and if-s. I think that after it is used more widely it will become pretty idiomatic.

I have created a patch that optimizes this idiom, making it as fast as a normal assignment. [2] Yury suggested asking Guido on the mailing list whether he agrees that this language pattern is worth optimizing/promoting.

[1] https://mail.python.org/pipermail/python-ideas/2018-February/048971.html
[2] https://bugs.python.org/issue32856
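To make the double evaluation concrete, here is a minimal sketch. The functions `f` and `g` are hypothetical stand-ins (the thread never defines them); `f` counts its own calls so the difference between the naive comprehension and the `for y in [f(x)]` idiom is visible.

```python
# Hypothetical stand-ins for the f and g used in the thread.
calls = 0

def f(x):
    global calls
    calls += 1          # count evaluations of the "expensive" subexpression
    return x * x

def g(y):
    return y + 1

# Naive comprehension: f(x) is evaluated twice per element.
calls = 0
naive = [f(x) + g(f(x)) for x in range(10)]
naive_calls = calls

# The idiom: `for y in [f(x)]` binds y once per element.
calls = 0
idiom = [y + g(y) for x in range(10) for y in [f(x)]]
idiom_calls = calls

print(naive == idiom, naive_calls, idiom_calls)  # True 20 10
```

Both forms produce the same list; only the number of calls to `f` differs.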
This thing has bitten me in the past. At the time I put together the "stackfull" package; it allows stuff like:

    from stackfull import push, pop
    ...
    [push(f(x)) + g(pop()) for x in range(10)]

It is painfully simple in its workings: it creates a plain old list in the frame f_locals and uses that as a stack in all stackfull.* operations.

Just posting because people involved in this thread might want to experiment with that. (It is on PyPI.)

js
-><-
On 02/22/2018 11:54 AM, Joao S. O. Bueno wrote:
On 22 February 2018 at 16:04, Serhiy Storchaka wrote:
Stephan Houben proposed an idiom which looks similar to new hypothetic syntax:
result = [y + g(y) for x in range(10) for y in [f(x)]]
This thing has bitten me in the past -
Do you recall how? That would be useful information. -- ~Ethan~
On Feb 22, 2018, at 11:04, Serhiy Storchaka <storchaka@gmail.com> wrote:
Stephan Houben proposed an idiom which looks similar to new hypothetic syntax:
result = [y + g(y) for x in range(10) for y in [f(x)]]
`for y in [expr]` in a comprehension means just assigning expr to y. I never seen this idiom before, but it can be a good replacement for a hypothetic syntax for assignment in comprehensions. It changes the original comprehension less than other approaches, just adds yet one element in a sequence of for-s and if-s. I think that after using it more widely it will become pretty idiomatic.
My questions are 1) will this become idiomatic enough to be able to understand at a glance what is going on, rather than having to pause to reason about what that 1-element list-like syntax actually means, and 2) will this encourage even more complicated comprehensions that are less readable than just expanding the code into a for-loop?

for-loops-are-not-evil-ly y’rs,
-Barry
Barry Warsaw writes:
My questions are 1) will this become idiomatic enough to be able to understand at a glance what is going on,
Is it similar enough to

    def f(x=[0]):

which is sometimes seen as a way to produce a mutable default value for function arguments, to be "idiomatic"?
rather than having to pause to reason about what that 1-element list-like syntax actually means, and 2) will this encourage even more complicated comprehensions that are less readable than just expanding the code into a for-loop?
Of course it will encourage more complicated comprehensions, and we know that complexity is less readable. On the other hand, a for loop with a temporary variable will take up at least 3 statements vs. a one-statement comprehension. I don't have an opinion about the equities there.

I myself will likely use the [(y, f(y)) for x in xs for y in costly(x)] idiom very occasionally, with emphasis on "very" (for almost all "costly" functions I might use that's the Knuthian root of error). But I don't know how others feel about it.

Steve
Stephen J. Turnbull wrote on 23.02.2018 at 03:31:
Barry Warsaw writes:
rather than having to pause to reason about what that 1-element list-like syntax actually means, and 2) will this encourage even more complicated comprehensions that are less readable than just expanding the code into a for-loop?
Of course it will encourage more complicated comprehensions, and we know that complexity is less readable. On the other hand, a for loop with a temporary variable will take up at least 3 statements vs. a one-statement comprehension.
IMHO, any complex comprehension should be split across multiple lines, definitely if it uses multiple for-loops, as in the discussed example. So the "space win" of a complex comprehension that requires temporary values over a multi-line for-statement is actually not big in these cases.

There are certainly cases where a comprehension still looks better, but I'm all for not encouraging a hacky idiom to stuff more into a comprehension. Comprehensions should be used to *improve* readability, not to reduce it.

Stefan
On 23 February 2018 at 09:12, Stefan Behnel <stefan_ml@behnel.de> wrote:
Stephen J. Turnbull wrote on 23.02.2018 at 03:31:
Barry Warsaw writes:
rather than having to pause to reason about what that 1-element list-like syntax actually means, and 2) will this encourage even more complicated comprehensions that are less readable than just expanding the code into a for-loop?
Of course it will encourage more complicated comprehensions, and we know that complexity is less readable. On the other hand, a for loop with a temporary variable will take up at least 3 statements vs. a one-statement comprehension.
IMHO, any complex comprehension should be split across multiple lines, definitely if it uses multiple for-loops, as in the discussed example. So the "space win" of a complex comprehension that requires temporary values over a multi-line for-statement is actually not big in these cases.
There are certainly cases where a comprehension still looks better, but I'm all for not encouraging a hacky idiom to stuff more into a comprehension. Comprehensions should be used to *improve* readabilty, not to reduce it.
In my view:

1. The proposal is for an optimisation, not a change to the language. So anything bad that can be done after the change can be done now.
2. I doubt many people avoid this construct at the moment because it's slow; it's more likely they do so because they hadn't thought of it, or because it harms readability.
3. Announcing that this construct is no longer slow might encourage some extra people to use it (because they now know about it, and they assume that the fact that we've optimised it implies we think it's a good idea).
4. Ultimately, readability will be the main factor here. And readability is subjective, so we sort of have to trust people to use their common sense.

This could easily be a premature optimisation. But on the other hand, it's a case of not making things unexpectedly slow, so I'm fine with that. If Serhiy doesn't feel that the optimisation code is a major maintenance burden, I'd say go for it. It's a minor quality of life improvement for a niche case, let's not view (or promote) it as anything more than that.

Paul
Is it similar enough to
def f(x=[0]):
No, not at all; it’s a very different use case.

When I first saw this on the original thread, I needed to stare at it a good while, and then whip up some code to experiment with it to know what it did.

And not because I don’t know what a single element list means, or what it means to iterate over a single element list, or what two fors mean in a comprehension.

I was confused by the ‘x’ in the second iterable. I guess I’m (still) not really clear on the scope(s) inside a comprehension, and when the elements get evaluated in a list.

I expected that the list would be created once, with the value x had initially, rather than getting re-evaluated each time through the outer loop.

So I think that it is a very confusing use of comprehensions, and always will be. I’m still surprised it’s legal. Anyone know if this being allowed was deliberate or just kind of fell out of the implementation?

So no, I don’t think it should be promoted as idiomatic.

All that being said, it’s valid Python, so why not optimize it?

-CHB
As to the validity or legality of this code, it's both, and working as intended.

A list comprehension of the form

    [STUFF for VAR1 in SEQ1 for VAR2 in SEQ2 for VAR3 in SEQ3]

should be seen (informally) as

    for VAR1 in SEQ1:
        for VAR2 in SEQ2:
            for VAR3 in SEQ3:
                "put STUFF in the result"

(If there are `if COND` phrases too those get inserted into the nested set of blocks where they occur in the sequence.)
-- --Guido van Rossum (python.org/~guido)
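The informal expansion above can be checked directly. The following is a small sketch, with a hypothetical `f` that records each call, showing that the inner iterable `[f(x)]` is rebuilt on every pass of the outer loop, exactly as in the nested-loop form.

```python
# Hypothetical stand-in for f; it records each call so we can see when
# the inner iterable [f(x)] is built.
trace = []

def f(x):
    trace.append(x)
    return x * 10

# Comprehension form.
comp = [y for x in range(3) for y in [f(x)]]
comp_trace = list(trace)

# Equivalent nested loops.
trace.clear()
loop = []
for x in range(3):
    for y in [f(x)]:      # a fresh one-element list on every outer iteration
        loop.append(y)
loop_trace = list(trace)

print(comp, comp_trace)                            # [0, 10, 20] [0, 1, 2]
print(comp == loop and comp_trace == loop_trace)   # True
```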
On Fri, Feb 23, 2018 at 9:51 AM, Guido van Rossum <guido@python.org> wrote:
As to the validity or legality of this code, it's both, and working as intended.
A list comprehension of the form
[STUFF for VAR1 in SEQ1 for VAR2 in SEQ2 for VAR3 in SEQ3]
should be seen (informally) as
for VAR1 in SEQ1:
    for VAR2 in SEQ2:
        for VAR3 in SEQ3:
            "put STUFF in the result"
Thanks -- right after posting, I realized that was the way to unpack it to understand it. I think my confusion came from two things:

1) I usually don't care in which order the loops are ordered -- i.e., that could be:

    for VAR3 in SEQ3:
        for VAR2 in SEQ2:
            for VAR1 in SEQ1:
                "put STUFF in the result"

As I usually don't care, I have to think about it (and maybe experiment to be sure). (this is the old Fortran vs C order thing :-)

2) since it's a single expression, I wasn't sure of the evaluation order, so maybe (in my head) it could have been (optimized) to be:

    [STUFF for VAR1 in Expression_that_evaluates_to_an_iterable1 for VAR2 in Expression_that_evaluates_to_an_iterable2]

and that could translate to:

    IT1 = Expression_that_evaluates_to_an_iterable1
    IT2 = Expression_that_evaluates_to_an_iterable2
    for VAR1 in IT1:
        for VAR2 in IT2:
            "put STUFF in the result"

In which case, VAR1 would not be available to Expression_that_evaluates_to_an_iterable2.

Maybe that was very wrong headed -- but that's where my head went -- and I'm not a Python newbie (maybe an oddity, though :-) )

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov
There are useful things you can only do with comprehensions if the second for-loop can use the variable in the first for-loop. E.g.

    [(i, j) for i in range(10) for j in range(i)]
-- --Guido van Rossum (python.org/~guido)
On Fri, Feb 23, 2018 at 10:45 AM, Guido van Rossum <guido@python.org> wrote:
There are useful things you can only do with comprehensions if the second for-loop can use the variable in the first for-loop. E.g.
[(i, j) for i in range(10) for j in range(i)]
indeed -- and that is fairly common use-case in nested for loops -- so good to preserve this.

But I still think the original:

    [g(y) for x in range(5) for y in [f(x)]]

Is always going to be confusing to read. Though I do agree that it's not too bad when you unpack it into for loops:

    In [89]: for x in range(5):
        ...:     for y in [f(x)]:
        ...:         l.append(g(y))

BTW, would it be even a tiny bit more efficient to use a tuple in the inner loop?

    [g(y) for x in range(5) for y in (f(x),)]

-CHB
Chris Barker wrote on 23.02.2018 at 20:23:
BTW, would it be even a tiny bit more efficient to use a tuple in the inner loop?
[g(y) for x in range(5) for y in (f(x),)]
Serhiy's optimisation does not use a loop at all anymore and folds it into a direct assignment "y=f(x)" instead.

But in general, yes, changing a list iterable into a tuple is an improvement as tuples are more efficient to allocate. Haven't tried it in CPython (*), but it might make a slight difference for very short iterables, which are probably common. Although the execution of the loop body will likely dominate the initial allocation by far.

Stefan

(*) I implemented this list->tuple transformation in Cython a while ago, but seeing Serhiy's change now got me thinking that this could be further improved into a stack allocated C array, to let the C compiler unroll the loop at will. I'll probably try that at some point...

https://github.com/cython/cython/issues/2117
On 24 February 2018 at 06:50, Stefan Behnel <stefan_ml@behnel.de> wrote:
But in general, yes, changing a list iterable into a tuple is an improvement as tuples are more efficient to allocate. Haven't tried it in CPython (*), but it might make a slight difference for very short iterables, which are probably common.
CPython has included the list->tuple conversion for lists of literals for quite some time, and Serhiy just posted a patch to extend that to all inline lists where it's a safe change to make:

https://bugs.python.org/issue32925

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
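For anyone who wants to measure the list-versus-tuple question on their own interpreter, here is a rough sketch. The function `f` and the helper names are hypothetical, and no timings are claimed here because they depend on the interpreter and version. As far as I know, CPython 3.9 and later optimize `for y in [expr]` in comprehensions into a plain assignment, in which case the two variants should time about the same.

```python
import timeit

def f(x):          # hypothetical stand-in for the thread's f
    return x + 1

def with_list():
    # One-element list as the inner iterable.
    return [y * 2 for x in range(5) for y in [f(x)]]

def with_tuple():
    # One-element tuple as the inner iterable.
    return [y * 2 for x in range(5) for y in (f(x),)]

# Both spellings produce the same result.
same = with_list() == with_tuple()
print(same)  # True

# Timings vary by interpreter and version, so no expected numbers are given.
t_list = timeit.timeit(with_list, number=10_000)
t_tuple = timeit.timeit(with_tuple, number=10_000)
print(f"list: {t_list:.4f}s  tuple: {t_tuple:.4f}s")
```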
Chris Barker writes:
But I still think the original:
[g(y) for x in range(5) for y in [f(x)]]
Is always going to be confusing to read.
But the point I was making with "def f(x=[0]):" was this: you have a situation where your desired semantics is "value of some type"[1], but the language's syntax doesn't permit a value of that type there, while "singleton sequence of that type" works fine. In fact, "singleton as value" is baked into Python in the form of str.__getitem__ and bytes.__getitem__. So we now have four use cases for singleton as value: two stringish actual types, and the two idioms "mutable default argument" and "local variable in comprehension". The horse is long since out of the barn.

Steve

Footnotes:
[1] Both "value" and "type" are used rather loosely here.
On Fri, Feb 23, 2018 at 11:23:04AM -0800, Chris Barker wrote:
But I still think the original:
[g(y) for x in range(5) for y in [f(x)]]
Is always going to be confusing to read. Though I do agree that it's not too bad when you unpack it into for loops:
In [89]: for x in range(5):
    ...:     for y in [f(x)]:
    ...:         l.append(g(y))
I think we should distinguish between:

* actively confusing; and
* merely not obvious at a glance.

I acknowledge that this idiom is not obvious at a glance, but I don't think this comes even close to actively confusing. As you say, once you mentally unpack the loops it becomes clear.

Given a potentially expensive DRY violation like:

    [(function(x), function(x)+1) for x in sequence]

there are at least five ways to solve it.

* Perhaps it doesn't need solving; if the DRY violation is trivial enough, and the cost low enough, who cares?

* Re-write as a for-loop instead of a comprehension;

* Use a helper function:

    def helper(x):
        tmp = function(x)
        return (tmp, tmp+1)

    [helper(x) for x in sequence]

* Chain the operations:

    [(a, a+1) for a in (function(x) for x in sequence)]

    [(a, a+1) for a in map(function, sequence)]

* Use a second loop to get an assignment:

    [(a, a+1) for x in sequence for a in [function(x)]]

I don't think we need to promote any one of the above as the One True idiom. They're all simple, more-or-less obvious or at least understandable, and deciding between them should be a matter of personal choice and in-house style guides.
BTW, would it be even a tiny bit more efficient to use a tuple in the inner loop?
[g(y) for x in range(5) for y in (f(x),)]
The suggested patch will recognise both `y in (a,)` and `y in [a]` and treat them the same as a direct assignment `y=a`. But if you're writing cross-interpreter code which might run on older versions of Python, or implementations which may not have this optimization, you might prefer to micro-optimize by using a tuple. -- Steve
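The alternatives listed in this message can be run side by side as a sanity check. This is only a sketch: `function` and `sequence` are hypothetical placeholders (a squaring function and a short range), chosen so the results are easy to compare.

```python
# `function` and `sequence` are hypothetical placeholders.
def function(x):
    return x * x

sequence = range(5)

# Helper function with a local temporary.
def helper(x):
    tmp = function(x)
    return (tmp, tmp + 1)

via_helper = [helper(x) for x in sequence]

# Chained operations: inner generator expression, or map().
via_genexp = [(a, a + 1) for a in (function(x) for x in sequence)]
via_map = [(a, a + 1) for a in map(function, sequence)]

# Second loop over a one-element list to get an assignment.
via_second_loop = [(a, a + 1) for x in sequence for a in [function(x)]]

# Plain for-loop.
via_for_loop = []
for x in sequence:
    a = function(x)
    via_for_loop.append((a, a + 1))

print(via_helper == via_genexp == via_map == via_second_loop == via_for_loop)  # True
```

All five spellings call `function` once per element and build the same list, so the choice between them is about readability rather than results.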
22.02.18 23:33, Barry Warsaw wrote:
On Feb 22, 2018, at 11:04, Serhiy Storchaka <storchaka@gmail.com> wrote:
Stephan Houben proposed an idiom which looks similar to new hypothetic syntax:
result = [y + g(y) for x in range(10) for y in [f(x)]]
`for y in [expr]` in a comprehension means just assigning expr to y. I never seen this idiom before, but it can be a good replacement for a hypothetic syntax for assignment in comprehensions. It changes the original comprehension less than other approaches, just adds yet one element in a sequence of for-s and if-s. I think that after using it more widely it will become pretty idiomatic.
My questions are 1) will this become idiomatic enough to be able to understand at a glance what is going on, rather than having to pause to reason about what that 1-element list-like syntax actually means, and 2) will this encourage even more complicated comprehensions that are less readable than just expanding the code into a for-loop?
I think everyone will have to pause when they encounter this idiom for the first time. Next time it will look more familiar. But the same has happened with other idioms like "lambda x=x:", "'...' % (x,)", "x = x or {}", etc. This is correct Python syntax, and you don't need to know anything special, beyond what is learned from the tutorial, to understand it.

All other alternatives (except the first one, which looks to me less readable than iterating a 1-element list) can't be used as an expression. They require several statements. At least four statements in the case of a for-loop.
I'm not saying anything new here, but since you asked specifically for my opinion: I don't care for the idiom; it's never occurred to me before, and it smells of cleverness. If I saw it in a code review I would probably ask for a regular for-loop to make the code more maintainable. But if you say it's useful for some class of users and it would be more useful if it was faster, I'm fine with the optimization. The optimization is also clever, and here I appreciate cleverness! -- --Guido van Rossum (python.org/~guido)
23.02.18 19:30, Guido van Rossum wrote:
I'm not saying anything new here, but since you asked specifically for my opinion: I don't care for the idiom; it's never occurred to me before, and it smells of cleverness. If I saw it in a code review I would probably ask for a regular for-loop to make the code more maintainable.
But if you say it's useful for some class of users and it would be more useful if it was faster, I'm fine with the optimization. The optimization is also clever, and here I appreciate cleverness!
Thank you. Given the mixed attitudes of other core developers toward this idiom, and the small overall effect of this optimization (since the problem solved by this idiom rarely occurs), I'm inclined to defer this optimization for some time (months or years). Maybe something will change during this period: either this idiom will become more popular, or new arguments against using it will be found, or a better solution will be found, or this optimization will become part of a more general optimization.
On Sun, Feb 25, 2018 at 6:36 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
23.02.18 19:30, Guido van Rossum wrote:
I'm not saying anything new here, but since you asked specifically for my opinion: I don't care for the idiom; it's never occurred to me before, and it smells of cleverness. If I saw it in a code review I would probably ask for a regular for-loop to make the code more maintainable.
But if you say it's useful for some class of users and it would be more useful if it was faster, I'm fine with the optimization. The optimization is also clever, and here I appreciate cleverness!
Thank you. Given the contradictory relation of other core developers to this idiom, and small total effect of this optimization (since the problem solved by using this idiom is rarely occurred), I'm inclined to defer this optimization on to some time (months or years). Maybe something will be changed during this period: either this idiom will become more popular, or new arguments against using it will be found, or better solution will be found, or this optimization will become the part of more general optimization.
Yeah, it doesn't seem there's any hurry. Opinions on the idiom are definitely, um, divided. :-) FWIW I don't care much about the 'f(x) as y' solution either, and being new syntax it has a much higher bar. -- --Guido van Rossum (python.org/~guido)
On Feb 22 2018, Serhiy Storchaka <storchaka@gmail.com> wrote:
1. Inner generator expression:
result = [y + g(y) for y in (f(x) for x in range(10))]
[...]
And maybe there are other ways.
I think the syntax recently brought up by Nick is still the most beautiful:

    result = [ (f(x) as y) + g(y) for x in range(10)]

..but I wonder if it is feasible to make the interpreter sufficiently smart to evaluate the first summand before the second.

Best,
-Nikolaus

--
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«
On Sun, Feb 25, 2018 at 11:02 PM, Nikolaus Rath <Nikolaus@rath.org> wrote:
On Feb 22 2018, Serhiy Storchaka <storchaka@gmail.com> wrote:
1. Inner generator expression:
result = [y + g(y) for y in (f(x) for x in range(10))]
[...]
And maybe there are other ways.
I think the syntax recently brough up by Nick is still the most beautiful:
result = [ (f(x) as y) + g(y) for x in range(10)]
..but I wonder if it is feasible to make the interpreter sufficiently smart to evaluate the first summand before the second.
It already has to. The order of evaluation in Python is well defined, mostly "left to right". But if you allow this in a comprehension, the obvious next step will be "do we allow this in ANY expression?", and the answer has to either be "yes" or "no, because {reasons}" for some very good value of 'reasons'. ChrisA
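The left-to-right rule mentioned above can be checked with existing syntax. A minimal sketch, where `trace` is a hypothetical helper (not from the thread) that records when each operand of `+` is evaluated:

```python
# Record the order in which the operands of `+` are evaluated.
order = []


def trace(label, value):
    # Hypothetical helper: note the evaluation, then pass the value through.
    order.append(label)
    return value


total = trace("left", 1) + trace("right", 2)
print(order, total)
```

The left operand is recorded before the right one, which is why a binding made in the first summand would already be available when the second summand is evaluated.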
On Feb 25 2018, Chris Angelico <rosuav@gmail.com> wrote:
On Sun, Feb 25, 2018 at 11:02 PM, Nikolaus Rath <Nikolaus@rath.org> wrote:
On Feb 22 2018, Serhiy Storchaka <storchaka@gmail.com> wrote:
1. Inner generator expression:
result = [y + g(y) for y in (f(x) for x in range(10))]
[...]
And maybe there are other ways.
I think the syntax recently brought up by Nick is still the most beautiful:
result = [ (f(x) as y) + g(y) for x in range(10)]
..but I wonder if it is feasible to make the interpreter sufficiently smart to evaluate the first summand before the second.
It already has to. The order of evaluation in Python is well defined, mostly "left to right".
Ah, then the problem is how to evaluate
result = [ y + g(f(x) as y) for x in range(10)]
I don't think there'd be a good reason to allow one but not the other.
But if you allow this in a comprehension, the obvious next step will be "do we allow this in ANY expression?"
Yes, of course. After all, IIRC Nick proposed it to simplify ternary expressions. Best, -Nikolaus -- GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.«
Le 25/02/2018 à 14:11, Nikolaus Rath a écrit :
On Feb 25 2018, Chris Angelico <rosuav@gmail.com> wrote:
On Sun, Feb 25, 2018 at 11:02 PM, Nikolaus Rath <Nikolaus@rath.org> wrote:
On Feb 22 2018, Serhiy Storchaka <storchaka@gmail.com> wrote:
1. Inner generator expression:
result = [y + g(y) for y in (f(x) for x in range(10))]
[...]
And maybe there are other ways.
I think the syntax recently brought up by Nick is still the most beautiful:
result = [ (f(x) as y) + g(y) for x in range(10)]
Honestly, I find this version the most readable, while the double for loop is completely weird to me, despite doing Python for a living for years. I really hope the latter doesn't become a common idiom.
Michel Desmoulin writes:
Le 25/02/2018 à 14:11, Nikolaus Rath a écrit :
result = [ (f(x) as y) + g(y) for x in range(10)]
Honestly I find this version the most readable while the double for loop is completely weird to me, despite doing python for a living for years.
I find this one less readable because I don't expect name binding syntax to return a value. My brain is nonplussed by the "+". :-)
I really hope the latter doesn't become a common idiom.
It already is common, for values of "common" = "some people have been using it where it's useful, but it's not useful all that often". I suppose it's rare because it's a less-than-readable optimization, which is frowned on in Python programming. Somebody counted four or five ways to perform this optimization, including this "double for" that allows the common subexpression optimization to be made explicit in the comprehension. We don't need another, not at the cost of new syntax. If we find that we really want a C-like assignment expression that returns the assigned value[1], I don't have an objection to that. But I don't personally feel a need for it. Footnotes: [1] Yes, I know that technically it's a local name binding, not an assignment. But until the scope of "local" is resolved, it looks, smells, and tastes like an assignment.
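To make the "double for" form concrete, here is a small sketch that counts how often the shared subexpression runs. The bodies of `f` and `g` and the `calls` list are placeholders chosen for illustration; only the two comprehensions come from the thread.

```python
calls = []


def f(x):
    # Stand-in for an expensive function; logs every invocation.
    calls.append(x)
    return x * x


def g(y):
    return y + 1


# Naive form: f(x) is evaluated twice per element.
naive = [f(x) + g(f(x)) for x in range(10)]
naive_calls = len(calls)

calls.clear()

# "Double for" idiom: the one-element list binds y = f(x) once per element.
idiom = [y + g(y) for x in range(10) for y in [f(x)]]
idiom_calls = len(calls)

print(naive == idiom, naive_calls, idiom_calls)
```

Both comprehensions build the same list; the idiom simply halves the number of calls to `f`.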
On Mon, Feb 26, 2018 at 12:11 AM, Nikolaus Rath <Nikolaus@rath.org> wrote:
On Feb 25 2018, Chris Angelico <rosuav@gmail.com> wrote:
On Sun, Feb 25, 2018 at 11:02 PM, Nikolaus Rath <Nikolaus@rath.org> wrote:
On Feb 22 2018, Serhiy Storchaka <storchaka@gmail.com> wrote:
1. Inner generator expression:
result = [y + g(y) for y in (f(x) for x in range(10))]
[...]
And maybe there are other ways.
I think the syntax recently brought up by Nick is still the most beautiful:
result = [ (f(x) as y) + g(y) for x in range(10)]
..but I wonder if it is feasible to make the interpreter sufficiently smart to evaluate the first summand before the second.
It already has to. The order of evaluation in Python is well defined, mostly "left to right".
Ah, then the problem is how to evaluate
result = [ y + g(f(x) as y) for x in range(10)]
That ought to raise UnboundLocalError, since y is evaluated before g's arguments. If you're reusing an expression, it isn't too much hassle to demand that the *first* instance of that expression be the one with the 'as' clause. Generally "first" means "leftmost", with rare exceptions (eg it'd be "y if (expr as y) else y" with the assignment in the middle), so that shouldn't bother most people.
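The `as` syntax discussed here does not exist, so the rule cannot be tested directly. As a rough stand-in, the sketch below uses the `:=` assignment expression of Python 3.8 and later; this is a substitution for illustration only, and its scoping differs from the statement-local ideas in this thread. The functions `f` and `g` are placeholders.

```python
def f(x):
    return x * 2


def g(y):
    return y + 1


def leftmost_binding():
    # The binding appears in the first (leftmost) use, so y exists
    # by the time g(y) is evaluated.
    return [(y := f(x)) + g(y) for x in range(3)]


def late_binding():
    # y is read before the binding on the right has been evaluated.
    return [y + g(y := f(x)) for x in range(3)]


ok = leftmost_binding()

try:
    late_binding()
    failed = False
except NameError:  # UnboundLocalError is a subclass of NameError
    failed = True

print(ok, failed)
```

Catching `NameError` covers both cases, since which of the two exceptions is raised depends on how the interpreter version compiles comprehensions.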
But if you allow this in a comprehension, the obvious next step will be "do we allow this in ANY expression?"
Yes, of course. After all, IIRC Nick proposed it to simplify ternary expressions.
The trouble with allowing 'expr as name' in any context is that it's pretty much guaranteed to create confusion in a 'with' statement. Compare:
with open(fn) as f:
with (open(fn) as f):
with contextlib.closing(open(fn)) as f:
with (contextlib.closing(open(fn)) as f):
Do they all do the same thing? Can you see at a glance which one is different, and *how* it's different? And I'm sure there are other situations where it would be similarly confusing, yet still potentially useful. Does this just get filed under "consenting adults"?
Speaking as a C programmer who's quite happy to write code like "while ((var = func()) != sentinel)", I wouldn't object to this coming up in Python; the "as name" syntax has the huge advantage over C's syntax in that you can't accidentally leave off one equals sign and get the wrong behaviour. But I know that a lot of people dislike this at a more fundamental level.
If someone wants to push for this, I think it probably needs a PEP - it's a point that comes up periodically. I don't think it's ever had a PEP written about it, but it's a bit hard to search for; maybe someone else knows off hand?
ChrisA
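The source of the confusion can be seen with existing syntax: `with EXPR as NAME` binds NAME to the result of the context manager's `__enter__()` method, whereas an expression-level `(EXPR as NAME)` would presumably bind the value of EXPR itself. A minimal sketch, where `Resource` is a hypothetical class standing in for a file-like object:

```python
import contextlib


class Resource:
    """Hypothetical stand-in for an object with a close() method."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


res = Resource()
wrapper = contextlib.closing(res)

# The with statement binds the result of wrapper.__enter__(),
# which for contextlib.closing is the wrapped object.
with wrapper as f:
    bound_by_with = f

print(bound_by_with is res)      # is f the wrapped resource?
print(bound_by_with is wrapper)  # is f the closing() object itself?
print(res.closed)                # was close() called on exit?
```

For `open(fn)` the two readings coincide, because a file object's `__enter__()` returns the file itself; for `contextlib.closing(...)` they would name different objects.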
On 26 February 2018 at 01:08, Chris Angelico <rosuav@gmail.com> wrote:
Speaking as a C programmer who's quite happy to write code like "while ((var = func()) != sentinel)", I wouldn't object to this coming up in Python; the "as name" syntax has the huge advantage over C's syntax in that you can't accidentally leave off one equals sign and get the wrong behaviour. But I know that a lot of people dislike this at a more fundamental level.
If someone wants to push for this, I think it probably needs a PEP - it's a point that comes up periodically. I don't think it's ever had a PEP written about it, but it's a bit hard to search for; maybe someone else knows off hand?
PEP 3150 is the most closely related PEP we have at the moment (and that only works for simple statements, since it relies on using a trailing block to name the subexpressions).
The "(EXPR as NAME)" syntax comes up periodically, especially in the context of while loops (where it would allow a direct translation of C-style embedded assignment idioms).
In addition to the potential confusion for "with (EXPR as NAME):" vs "with EXPR as NAME:" (and the similar ambiguity for "except" clauses), some other major questions to be resolved are:
* are statement locals in a class namespace turned into attributes on the resulting class? (it would be more useful if they weren't)
* are statement locals in a module namespace turned into attributes on the resulting module? (it would be more useful if they weren't)
* are statement locals in a function/generator/coroutine namespace kept alive until the entire call terminates? (it would be more useful if they weren't)
* do currently defined statement locals appear in calls to locals() or in frame.f_locals?
* can lexically nested scopes see names bound this way? (a lot of complex name resolution problems disappear if they can't, plus you get a clearer distinction between these and regular function locals)
To be interesting enough to potentially be worthy of syntax, I think name bindings written this way would need to be truly statement local:
* reference is released at the end of the statement (whether simple or compound)
* no ability to close over them (this goes hand in hand with eagerly dropping the reference)
* we play name mangling games and/or use different opcodes to avoid overwriting regular function locals and to avoid appearing in locals()
That's still only enough to get the concept into python-ideas territory though (as per the discussion of "(EXPR as .NAME)" in https://mail.python.org/pipermail/python-ideas/2018-February/049002.html) - it's still a *long* way from being fully baked enough to make into a concrete change proposal for 3.8+.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
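The first question in the list above has an analogue in current Python 3 behaviour, which may help frame it. In the sketch below (with `f`, `Example`, `leaked`, and `hidden` as made-up names), a `for` statement in a class body leaves its target behind as a class attribute, while the loop variables of a comprehension using the `for y in [expr]` idiom do not escape:

```python
def f(x):
    # Hypothetical stand-in for the shared subexpression.
    return x * 2


class Example:
    # A for statement binds its target in the class namespace,
    # so the name ends up as a class attribute.
    for leaked in [f(1)]:
        pass

    # A comprehension's loop variables stay inside the comprehension.
    values = [hidden + 1 for x in range(3) for hidden in [f(x)]]


print(hasattr(Example, "leaked"), hasattr(Example, "hidden"))
print(Example.values)
```

This is only an illustration of existing scoping rules, not a model of how statement locals would be implemented.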
On Mon, Feb 26, 2018 at 8:00 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 26 February 2018 at 01:08, Chris Angelico <rosuav@gmail.com> wrote:
Speaking as a C programmer who's quite happy to write code like "while ((var = func()) != sentinel)", I wouldn't object to this coming up in Python; the "as name" syntax has the huge advantage over C's syntax in that you can't accidentally leave off one equals sign and get the wrong behaviour. But I know that a lot of people dislike this at a more fundamental level.
If someone wants to push for this, I think it probably needs a PEP - it's a point that comes up periodically. I don't think it's ever had a PEP written about it, but it's a bit hard to search for; maybe someone else knows off hand?
PEP 3150 is the most closely related PEP we have at the moment (and that only works for simple statements, since it relies on using a trailing block to name the subexpressions).
The "(EXPR as NAME)" syntax comes up periodically, especially in the context of while loops (where it would allow a direct translation of C-style embedded assignment idioms).
In addition to the potential confusion for "with (EXPR as NAME):" vs "with EXPR as NAME:" (and the similar ambiguity for "except" clauses), some other major questions to be resolved are:
* are statement locals in a class namespace turned into attributes on the resulting class? (it would be more useful if they weren't)
* are statement locals in a module namespace turned into attributes on the resulting module? (it would be more useful if they weren't)
* are statement locals in a function/generator/coroutine namespace kept alive until the entire call terminates? (it would be more useful if they weren't)
* do currently defined statement locals appear in calls to locals() or in frame.f_locals?
* can lexically nested scopes see names bound this way? (a lot of complex name resolution problems disappear if they can't, plus you get a clearer distinction between these and regular function locals)
To be interesting enough to potentially be worthy of syntax, I think name bindings written this way would need to be truly statement local:
* reference is released at the end of the statement (whether simple or compound)
* no ability to close over them (this goes hand in hand with eagerly dropping the reference)
* we play name mangling games and/or use different opcodes to avoid overwriting regular function locals and to avoid appearing in locals()
Definitely possible. I wish I still had the POC patch that I put together a while ago that created block-local variables from 'as' bindings in except and with statements, but it got lost in a hard drive crash (because, at the time, I didn't think it was useful for anything more than "haha, isn't that cute"). Will see if I can recreate it. ChrisA
On 22/02/2018 19:04, Serhiy Storchaka wrote:
Yet another discussion about reusing common subexpressions in comprehensions took place last week on the Python-ideas mailing list (see topic "Temporary variables in comprehensions" [1]). The problem is that in a comprehension like `[f(x) + g(f(x)) for x in range(10)]` the subexpression `f(x)` is evaluated twice. In a normal loop you can introduce a temporary variable for `f(x)`. The OP wanted to add a special syntax for introducing temporary variables in comprehensions. This idea has already been discussed multiple times in the past.
There are several ways of resolving this problem with existing syntax.
[snip]
Stephan Houben proposed an idiom which looks similar to the new hypothetical syntax:
result = [y + g(y) for x in range(10) for y in [f(x)]]
`for y in [expr]` in a comprehension means just assigning expr to y. I have never seen this idiom before, but it can be a good replacement for a hypothetical syntax for assignment in comprehensions. It changes the original comprehension less than other approaches, adding just one more element to the sequence of for-s and if-s. I think that after it is used more widely it will become pretty idiomatic.
I have created a patch that optimizes this idiom, making it as fast as a normal assignment. [2] Yury suggested asking Guido on the mailing list whether he agrees that this language pattern is worth optimizing/promoting.
Here's a thought: allow the syntax
for VAR = EXPR
to define a for-loop that is executed exactly once (both inside and outside comprehensions), i.e. pretty much a synonym for
for VAR in [ EXPR ]
for VAR in ( EXPR , )
especially if Serhiy's optimisation means that the list/tuple is not actually constructed in the latter.
Pros:
(1) Stephan Houben's example could be written as
result = [y + g(y) for x in range(10) for y = f(x)]
which I find more readable.
(2) Code such as
for i in xrange(10):
could be changed on the fly to:
for i = 1:
I see this as especially useful in debugging, where you want to limit the program execution to a known problematic bit. But in some contexts it could be good style.
(3) Preserves the compatibility between a list comprehension and its "expansion" into for-loops.
(4) Backward compatible, since it is currently illegal syntax
(5) No extra keyword needed
(6) It goes some way towards providing the functionality of
with VAR as EXPR
that has been discussed multiple times.
Best wishes
Rob Cliffe
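For comparison, the two existing spellings that the proposed `for VAR = EXPR` would abbreviate can be run today, together with their expansion into nested for-loops (point 3 above). A small sketch; the bodies of `f` and `g` are placeholders chosen for illustration:

```python
def f(x):
    return x * x


def g(y):
    return y + 1


# One-element list and one-element tuple both run the inner loop exactly once.
with_list = [y + g(y) for x in range(10) for y in [f(x)]]
with_tuple = [y + g(y) for x in range(10) for y in (f(x),)]

# The same comprehension expanded into nested for-loops.
expanded = []
for x in range(10):
    for y in [f(x)]:
        expanded.append(y + g(y))

print(with_list == with_tuple == expanded)
```

All three forms produce the same list, so the proposal would change only the spelling, not the result.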
I would like to remind all wannabe language designers that grammar design is not just solving puzzles. It's also about keeping the overall feel of the language readable. I'm getting the idea that none of the proposals discussed so far (whether new syntax or clever use of existing syntax) satisfy that constraint. Sometimes a for-loop is just better. -- --Guido van Rossum (python.org/~guido)
On 27 February 2018 at 05:08, Guido van Rossum <guido@python.org> wrote:
I would like to remind all wannabe language designers that grammar design is not just solving puzzles. It's also about keeping the overall feel of the language readable. I'm getting the idea that none of the proposals discussed so far (whether new syntax or clever use of existing syntax) satisfy that constraint. Sometimes a for-loop is just better.
+1
This is the main reason PEP 3150 (which adds a more limited form of statement local named subexpressions) has spent more time Deferred than it has ever spent being discussed as an active draft proposal: while naming subexpressions is an occasionally attractive prospect, it's also an addition that has significant potential to change the way various kinds of code are typically written (even more so than something like type hints or f-strings).
When even a PEP's author is thinking "I'm not sure this will actually be a net improvement to the language", it's really not a good sign :)
Cheers, Nick.
P.S. The comprehension-centric variants at least have the virtue of precedent in Haskell's "let" clauses: https://stackoverflow.com/questions/6067839/haskell-let-where-equivalent-wit...
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 26/02/2018 19:08, Guido van Rossum wrote:
I would like to remind all wannabe language designers that grammar design is not just solving puzzles. It's also about keeping the overall feel of the language readable. I'm getting the idea that none of the proposals discussed so far (whether new syntax or clever use of existing syntax) satisfy that constraint. Sometimes a for-loop is just better.
I don't know if you intended these remarks to include my proposal (to allow "for VAR = EXPR"), as your message was posted only 27 minutes after mine. With respect, I honestly feel that this is a relatively small change that makes the language *more* readable. Feel free, one and all, to tell me why I'm wrong. Best wishes, Rob Cliffe
On Mon, Feb 26, 2018 at 4:30 PM, Rob Cliffe via Python-Dev < python-dev@python.org> wrote:
On 26/02/2018 19:08, Guido van Rossum wrote:
I would like to remind all wannabe language designers that grammar design is not just solving puzzles. It's also about keeping the overall feel of the language readable. I'm getting the idea that none of the proposals discussed so far (whether new syntax or clever use of existing syntax) satisfy that constraint. Sometimes a for-loop is just better.
I don't know if you intended these remarks to include my proposal (to allow "for VAR = EXPR"), as your message was posted only 27 minutes after mine. With respect, I honestly feel that this is a relatively small change that makes the language *more* readable.
Feel free, one and all, to tell me why I'm wrong. Best wishes, Rob Cliffe
I didn't want to single you out, but yes, I did include your proposal. The reason is that for people who are not Python experts there's no obvious reason why `for VAR = EXPR` should mean one thing and `for VAR in EXPR` should mean another. -- --Guido van Rossum (python.org/~guido)
On Mon, Feb 26, 2018 at 7:51 PM, Guido van Rossum <guido@python.org> wrote: ..
The reason is that for people who are not Python experts there's no obvious reason why `for VAR = EXPR` should mean one thing and `for VAR in EXPR` should mean another.
This would be particularly surprising for people exposed to Julia where these two forms are equivalent:
julia> for x = [1,2] println(x); end
1
2
julia> for x in [1,2] println(x); end
1
2
participants (17)
- Alexander Belopolsky
- Barry Warsaw
- Chris Angelico
- Chris Barker
- Chris Barker - NOAA Federal
- Ethan Furman
- Guido van Rossum
- Joao S. O. Bueno
- Michel Desmoulin
- Nick Coghlan
- Nikolaus Rath
- Paul Moore
- Rob Cliffe
- Serhiy Storchaka
- Stefan Behnel
- Stephen J. Turnbull
- Steven D'Aprano