Introducing where clauses

Hello.

Reading the discussion on python-ideas about "Accessing the result of comprehension's expression from the conditional", I've come to the idea of where clauses, similar to Haskell's.

This solves the problem of recalculating a value multiple times. For example, in the following expression:

    [(f(x), f(x)) for x in some_iterable if f(x) < 2]

the value f(x) is calculated up to three times for each x. That is a problem if the function f takes a long time to compute its value or has side effects. If we had a where clause, we could rewrite the expression above as:

    [(y, y) for x in some_iterable if y < 2 where y = f(x)]

I think it is really useful. We could also extend this idea to lambdas, or maybe to introducing arbitrary scoping blocks of code.

Other thoughts:

- Can we use where clauses in lambda definitions to allow some kind of multi-line lambdas?
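To make the repeated evaluation concrete, here is a minimal sketch that counts calls. The body of f and the contents of some_iterable are hypothetical stand-ins chosen for illustration; the thread never defines them.

```python
# Hypothetical stand-in for the thread's f; it records every call it receives.
calls = []

def f(x):
    calls.append(x)
    return x * x

some_iterable = [0, 1, 2, 3]  # assumed sample data

# The comprehension from the post: f(x) appears three times in the source.
result = [(f(x), f(x)) for x in some_iterable if f(x) < 2]

print(result)      # [(0, 0), (1, 1)]
print(len(calls))  # 8
```

With this data, each element that passes the filter costs three calls and each rejected element costs one, which gives eight calls for four elements.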

Andrey Popp wrote:
Or you could just bite the bullet and write a custom generator:

    def g(iterable):
        for x in iterable:
            y = f(x)
            if y < 2:
                yield (y, y)

Give it a meaningful name and docstring and it can even be self-documenting.

Lambdas, comprehensions and expressions in general all have limits - usually deliberate ones. When one runs up against those limits it is a hint that it is time to switch to using multiple statements (typically factored out into a function that can be substituted for the original inline expression).

But then, I'll freely confess to not really understanding the apparently common obsession with wanting to be able to do everything as an expression.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
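A runnable version of this suggestion might look like the sketch below. The name pairs_below_two, the docstring, and the body of f are assumptions added for illustration; only the loop structure comes from the message.

```python
def f(x):
    # Hypothetical stand-in for an expensive function.
    return x * x

def pairs_below_two(iterable):
    """Yield (y, y) for every y = f(x) that is less than 2."""
    for x in iterable:
        y = f(x)  # f runs exactly once per element
        if y < 2:
            yield (y, y)

print(list(pairs_below_two(range(4))))  # [(0, 0), (1, 1)]
```

Because it is a generator function, the same definition works with any iterable and produces its pairs lazily.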

On Mon, 22 Jun 2009 08:40:05 pm Nick Coghlan wrote:
+1
Some things are conceptually a single operation, and those things are good to write as a single expression. Before we had sorted(), it was uncomfortable to write:

    L.sort()
    return L

when you wanted a sorted list, because "return a sorted list" is conceptually a single operation, even if sorting is non-trivial. The solution to this was to write a helper function, which was made obsolete when sorted() became a built-in.

The OP's suggestion:

    [(f(x), f(x)) for x in some_iterable if f(x) < 2]

is not conceptually a single operation, because producing a list of two-tuples containing some value y repeated but only if y is less than 2 is not conceptually simple. If it were, it would be easy to describe the operation with one or two words, instead of the fourteen it took me.

I still believe that the right way to solve this is with a pipeline of simple operations:

    map(lambda obj: (obj, obj), filter(lambda y: y < 2, map(f, some_iterable)))

Re-write with temporary variables, itertools, and generator expressions as preferred.

--
Steven D'Aprano
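The pipeline can be split into named stages, as the message suggests. The sketch below assumes Python 3, where map() and filter() return lazy iterators (in the Python 2 of 2009 they returned lists); f and some_iterable are hypothetical sample values.

```python
def f(x):
    # Hypothetical stand-in for the thread's f.
    return x - 1

some_iterable = [0, 1, 2, 3, 4]  # assumed sample data

# Stage 1: apply f to every element (f runs once per element).
mapped = map(f, some_iterable)
# Stage 2: keep only the values below 2.
small = filter(lambda y: y < 2, mapped)
# Stage 3: duplicate each surviving value into a pair.
pairs = map(lambda obj: (obj, obj), small)

pipeline_result = list(pairs)
print(pipeline_result)  # [(-1, -1), (0, 0), (1, 1)]

# Same result as the original comprehension, which calls f repeatedly.
naive_result = [(f(x), f(x)) for x in some_iterable if f(x) < 2]
print(naive_result == pipeline_result)  # True
```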

I think that producing a list of tuples (conceptually, the image of a mapping over the set some_iterable) is a basic operation. But OK, now we have three ways to produce the list below:

    [(f(x), f(x)) for x in some_iterable if f(x) < 2]

1)
2)
    [(y, y) for y in (f(x) for x in some_iterable) if y < 2]
3)

None of them looks as obvious as

    [(f(x), f(x)) for x in some_iterable if f(x) < 2]

does it? Meanwhile, the proposed variant with a where clause

    [(y, y) for x in some_iterable if y < 2 where y = f(x)]

looks more natural than the three suggested variants. I want to stress that the where clause is only syntactic sugar, suggested for better readability.

On Mon, Jun 22, 2009 at 8:46 AM, Andrey Popp<8mayday@gmail.com> wrote:
[(y, y) for x in some_iterable if y < 2 where y = f(x)]
looks more natural than the three suggested variants.
Only if you are already thinking about SQL. An "as" would be almost as good as a "where", and would better match the way python has evolved so far. But that still doesn't answer the real question. In isolation, I would see either as a real (if perhaps small and isolated) improvement to readability. The catch is that the improvement must be dramatic enough to justify the cost -- which is extra complexity to the language as a whole. That isn't a cost you see as easily, because you're thinking about comprehensions; it is instead a small tax paid by people doing regular function calls and if branching and for loops. It isn't a big tax, but it is cumulative, and python has worked hard to minimize it. -jJ

On Mon, Jun 22, 2009 at 4:59 PM, Jim Jewett<jimjjewett@gmail.com> wrote:
I was thinking mostly of Haskell. "as" or "where": for me, there is no difference.
I am not talking only about list comprehensions; there are other use cases for a where clause, for example lambdas:

    f = lambda x: (x, y) if x > 0 else (x, 0) where y = g(x)

Or maybe more. List comprehensions are only a small example that shows one of the use cases for a where clause.
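For comparison, the same lambda can already be approximated in existing syntax by binding y through an inner lambda. This is only a sketch: g is a hypothetical stand-in, and unlike a lazily evaluated Haskell-style where, g(x) is computed here even when x <= 0.

```python
def g(x):
    # Hypothetical stand-in for the thread's g.
    return x * 10

# The inner lambda plays the role of "where y = g(x)":
# y is bound once, then used in the conditional expression.
f = lambda x: (lambda y: (x, y) if x > 0 else (x, 0))(g(x))

print(f(2))   # (2, 20)
print(f(-1))  # (-1, 0)
```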

Maybe "let" would be less ambiguous (because of what "where" means in SQL). Also, "let" is used in JavaScript (I consider JavaScript very similar to Python and Ruby).

-panzi

Andrey Popp wrote:
Yuck. This reminds me of why I gave up on the Haskell tutorial I was working through. The reading of this line keeps bouncing back and forth. "OK, function f, passing in x, returning x, y… Wait? What's y? Is that from an external scope? Anyway, here's an if-else clause, and oh, there's the y! It's the same as g(x). OK, so where all did they use y? Hmm, let's see, looks like just the one spot…"

This would be much easier to grasp as a function:

    def f(x):
        y = g(x)
        if y > 0:
            return x, y
        else:
            return x, 0

This version makes the parallelism between the two return values much more clear: "Oh, OK, it's always going to return x as the first in the tuple, and the second value will be either g(x) or 0, whichever is greater."

We might even re-write it as

    def f(x):
        y = g(x)
        r = y if y > 0 else 0
        return x, r

to make it shorter.

-- Carl

Personally I like the idea of the "where" clause. It works well in Haskell since it is tied closely to how functions are often defined in mathematics, e.g.

    Area of a circle = pi*r**2
      where pi is 3.14159....
        and r is the radius of the circle

In Haskell, it makes for concise function definitions. IIRC defining functions without the "where" clause in Haskell is a Hassle with a capital "H". However Python suffers from no such problem. Though I like the idea as a concept, I see it as syntactic sugar for Python that is essentially a solution in search of a problem.

On Tue, Jun 23, 2009 at 3:11 AM, Georg Brandl<g.brandl@gmx.net> wrote:
-- Gerald Britton

Andrey Popp wrote:
Taken in isolation, there is no reason to produce a list rather than an iterator.
This does more than the above because g is reusable both with the same iterable and other iterables.
2)
[(y, y) for y in (f(x) for x in some_iterable) if y < 2]
Though I would probably write the reusable generator, I might write this as

    ygen = (f(x) for x in some_iterable)
    # or map(f, some_iterable) if f is an existing function
    ypairs = ((y, y) for y in ygen if y < 2)

There are really two ideas:

- map f to some_iterable
- make pairs conditionally

There should be no shame in putting each in a separate statement. Nested generators do not really turn this into 'two' iterations, as iterating with ypairs will run the implied loop in synchrony.
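The point about synchrony can be checked with a small trace. In this sketch f is a hypothetical stand-in that logs each call, and some_iterable is assumed sample data; the two generator expressions are the ones from the message.

```python
log = []

def f(x):
    # Hypothetical stand-in that records when it is called.
    log.append(f"f({x})")
    return x * x

some_iterable = [0, 1, 2]  # assumed sample data

ygen = (f(x) for x in some_iterable)
ypairs = ((y, y) for y in ygen if y < 2)

# Building the generators does not call f at all.
log_before_iteration = list(log)

for pair in ypairs:
    log.append(f"got {pair}")

print(log_before_iteration)  # []
print(log)
# ['f(0)', 'got (0, 0)', 'f(1)', 'got (1, 1)', 'f(2)']
```

The calls to f interleave with the consumption of pairs, so there is a single pass over some_iterable rather than two separate loops.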
'natural' is in the eye of the beholder
I want to stress that the where clause is only syntactic sugar, suggested for better readability.
Too much sugar = stomach ache ;-).

Terry Jan Reedy

Terry Reedy <tjreedy@udel.edu> writes:
Indeed. My only point with the above example is that, for those who *are* desperate to have this all as a single readable statement, the existing syntax supports it nicely. Since the existing syntax already easily supports every case where the proposed syntax would be used, the burden is on those proposing the new syntax to demonstrate benefits significant enough to outweigh the costs of adding bulk to the language. AFAICT, no convincingly significant benefit has been presented to add syntax for this case.
+1 QOTW

--
“Holy knit one purl two, Batman!” - Robin
Ben Finney

Andrey Popp <8mayday@gmail.com> writes:
New syntax isn't necessary to solve the above stated problem. For example, the following existing syntax is also a solution::

    [(y, y) for y in (f(x) for x in some_iterable) if y < 2]

For the proposed new syntax to be accepted, it would need to be somehow significantly superior to the existing syntax. Can you demonstrate how it's superior?

--
“I took a course in speed waiting. Now I can wait an hour in only ten minutes.” - Steven Wright
Ben Finney

On Mon, Jun 22, 2009 at 3:01 PM, Ben Finney<ben+python@benfinney.id.au> wrote:
Your statement:

    [(y, y) for y in (f(x) for x in some_iterable) if y < 2]

means producing a list by iterating over a generator, which in turn iterates over some_iterable. It is correct algorithmically, but not semantically. I did not claim that this kind of thing is impossible in Python right now, without where clauses. It is only syntactic sugar, but very expressive, I think.

Andrey Popp writes:
That's only true if you think of "iterable" as a generalized (i.e., possibly infinite) sequence. However, you can also think of this, not as "iterating over a sequence of values created by iterating over another sequence of values," but rather "iterating 'application' over a sequence of filters." Then the "for" clause functions as a "where". The first "for" in Ben's expression is interpreted as "where y = f(x)", and the second "for" is interpreted as "where x = next(some_iterable)".

This is not fully general; it only works at all in a comprehension context, and I'm not entirely sure it works perfectly here. But it troubles me that we already have a way to say "where" at the upper levels of nesting, and you want to introduce a new "where" that is only useful at upper levels of nesting.

Another way to put this is that because these are iterables (streams) rather than sequences (ordered (finite) sets), it doesn't make sense to talk about "double iteration." There are multiple levels of iterable here, but in the end there's only one iterative process, and all the levels share the same "iteration".

participants (11)
- Andrey Popp
- Ben Finney
- Carl Johnson
- Georg Brandl
- Gerald Britton
- Jim Jewett
- Mathias Panzenböck
- Nick Coghlan
- Stephen J. Turnbull
- Steven D'Aprano
- Terry Reedy