Introducing where clauses
Hello.
Reading the discussion on python-ideas about "Accessing the result of comprehension's expression from the conditional", I've come to the idea of where clauses, similar to Haskell's.
This solves the problem of recalculating a value multiple times. For example, in the following expression:
[(f(x), f(x)) for x in some_iterable if f(x) < 2]
the value f(x) is calculated up to three times for each x. That is a problem if the function f takes a long time to compute its value or has side effects. If we had a where clause, we could rewrite the expression above as:
[(y, y) for x in some_iterable if y < 2 where y = f(x)]
I think it is really useful. We can also expand this idea to lambdas or maybe to introducing arbitrary scoping blocks of code.
Other thoughts: can we use where clauses in lambda definitions to allow some kind of multiline lambdas?
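To make the recalculation cost concrete, here is a minimal sketch that records every call. The function body, the name `call_log`, and the sample input are all invented for illustration; only the comprehension itself comes from the post.

```python
from collections import Counter

call_log = []  # hypothetical helper: records every argument f is called with


def f(x):
    """Stand-in for an expensive function (assumption: squaring is just a placeholder)."""
    call_log.append(x)
    return x * x


some_iterable = [0, 1, 2, 3]

# The comprehension from the post: f(x) is evaluated once in the filter and,
# for items that pass the filter, twice more to build the tuple.
result = [(f(x), f(x)) for x in some_iterable if f(x) < 2]

calls_per_item = Counter(call_log)
print(result)
print(calls_per_item)
```

Items that pass the filter cost three calls each, while rejected items cost one, which is where the "up to three times" comes from.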
Andrey Popp wrote:
the value f(x) is calculated up to three times for each x. That is a problem if the function f takes a long time to compute its value or has side effects. If we had a where clause, we could rewrite the expression above as:
[(y, y) for x in some_iterable if y < 2 where y = f(x)]
I think it is really useful. We can also expand this idea to lambdas or maybe to introducing arbitrary scoping blocks of code.
Or you could just bite the bullet and write a custom generator:
def g(iterable):
    for x in iterable:
        y = f(x)
        if y < 2:
            yield (y, y)
Give it a meaningful name and docstring and it can even be self-documenting.
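As a rough sketch of what "a meaningful name and docstring" might look like: the name `small_result_pairs`, the `limit` parameter, and the body of `f` are assumptions made up for this example, not anything proposed in the thread.

```python
def f(x):
    # Hypothetical stand-in for the expensive function from the thread.
    return x * x


def small_result_pairs(iterable, limit=2):
    """Yield (y, y) for each y = f(x) that is below `limit`.

    f is called exactly once per item of `iterable`.
    """
    for x in iterable:
        y = f(x)
        if y < limit:
            yield (y, y)


pairs = list(small_result_pairs(range(5)))
print(pairs)
```

The call site then reads as a single named operation, and the generator can be reused with other iterables.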
Lambdas, comprehensions and expressions in general all have limits, usually deliberate ones. When one runs up against those limits it is a hint that it is time to switch to using multiple statements (typically factored out into a function that can be substituted for the original inline expression).
But then, I'll freely confess to not really understanding the apparently common obsession with wanting to be able to do everything as an expression.
Cheers, Nick.
On Mon, 22 Jun 2009 08:40:05 pm Nick Coghlan wrote:
Lambdas, comprehensions and expressions in general all have limits, usually deliberate ones. When one runs up against those limits it is a hint that it is time to switch to using multiple statements (typically factored out into a function that can be substituted for the original inline expression).
+1
But then, I'll freely confess to not really understanding the apparently common obsession with wanting to be able to do everything as an expression.
Some things are conceptually a single operation, and those things are good to write as a single expression. Before we had sorted(), it was uncomfortable to write:
L.sort()
return L
when you wanted a sorted list, because "return a sorted list" is conceptually a single operation, even if sorting is nontrivial. The solution to this was to write a helper function, which was made obsolete when sorted() became a builtin.
The OP's suggestion:
[(f(x), f(x)) for x in some_iterable if f(x) < 2]
is not conceptually a single operation, because producing a list of two-tuples containing some value y repeated but only if y is less than 2 is not conceptually simple. If it were, it would be easy to describe the operation with one or two words, instead of the fourteen it took me.
I still believe that the right way to solve this is with a pipeline of simple operations:
map(lambda obj: (obj, obj), filter(lambda y: y < 2, map(f, some_iterable)))
Rewrite with temporary variables, itertools, and generator expressions as preferred.
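One possible reading of "rewrite with temporary variables", sketched under the assumption of Python 3, where map() and filter() return lazy iterators (so the one-liner needs list() to produce a list). The body of `f` and the sample input are placeholders.

```python
def f(x):
    # Hypothetical stand-in for the function being mapped.
    return x - 1


some_iterable = range(5)

# Each step of the pipeline gets its own name.
values = map(f, some_iterable)            # apply f to every item, lazily
small = filter(lambda y: y < 2, values)   # keep only results below 2
pairs = [(y, y) for y in small]           # duplicate each survivor into a pair

# The original one-liner, for comparison.
one_liner = list(
    map(lambda obj: (obj, obj), filter(lambda y: y < 2, map(f, some_iterable)))
)

print(pairs)
print(pairs == one_liner)
```

Either spelling calls f once per item; the named version just makes each stage visible.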
I think that producing a list of tuples (which is conceptually the image of a mapping from the some_iterable set) is a basic operation. But... OK, now we have three ways to produce the list below:
[(f(x), f(x)) for x in some_iterable if f(x) < 2]
1)
def g(iterable):
    for x in iterable:
        y = f(x)
        if y < 2:
            yield (y, y)
2)
[(y, y) for y in (f(x) for x in some_iterable) if y < 2]
3)
map(lambda obj: (obj, obj), filter(lambda y: y < 2, map(f, some_iterable)))
And none of them looks as obvious as
[(f(x), f(x)) for x in some_iterable if f(x) < 2]
, does it? While the proposed variant with a where-clause
[(y, y) for x in some_iterable if y < 2 where y = f(x)]
looks more natural than the three suggested variants.
I strongly emphasize that the where-clause is only syntactic sugar, suggested for better readability.
On Mon, Jun 22, 2009 at 8:46 AM, Andrey Popp 8mayday@gmail.com wrote:
[(y, y) for x in some_iterable if y < 2 where y = f(x)]
looks more natural than the three suggested variants.
Only if you are already thinking about SQL.
An "as" would be almost as good as a "where", and would better match the way python has evolved so far.
But that still doesn't answer the real question. In isolation, I would see either as a real (if perhaps small and isolated) improvement to readability.
The catch is that the improvement must be dramatic enough to justify the cost, which is extra complexity to the language as a whole. That isn't a cost you see as easily, because you're thinking about comprehensions; it is instead a small tax paid by people doing regular function calls and if branching and for loops. It isn't a big tax, but it is cumulative, and Python has worked hard to minimize it.
jJ
On Mon, Jun 22, 2009 at 4:59 PM, Jim Jewett jimjjewett@gmail.com wrote:
On Mon, Jun 22, 2009 at 8:46 AM, Andrey Popp 8mayday@gmail.com wrote:
[(y, y) for x in some_iterable if y < 2 where y = f(x)]
looks more natural than the three suggested variants.
Only if you are already thinking about SQL.
An "as" would be almost as good as a "where", and would better match the way python has evolved so far.
I was thinking mostly about Haskell. "as" or "where": for me, there is no difference.
But that still doesn't answer the real question. In isolation, I would see either as a real (if perhaps small and isolated) improvement to readability.
The catch is that the improvement must be dramatic enough to justify the cost, which is extra complexity to the language as a whole. That isn't a cost you see as easily, because you're thinking about comprehensions; it is instead a small tax paid by people doing regular function calls and if branching and for loops. It isn't a big tax, but it is cumulative, and Python has worked hard to minimize it.
I am not talking about list comprehensions only; there are other cases for a where-clause, for example lambdas:
f = lambda x: (x, y) if x > 0 else (x, 0) where y = g(x)
Or maybe more. List comprehensions are only a small example that shows one of the use cases for the where-clause.
Maybe "let" would be less ambiguous (because of what "where" means in SQL). Also, let is used in JavaScript (I consider JavaScript very similar to Python and Ruby).
panzi
Andrey Popp wrote:
I am not talking about list comprehensions only; there are other cases for a where-clause, for example lambdas:
f = lambda x: (x, y) if x > 0 else (x, 0) where y = g(x)
Yuck. This reminds me of why I gave up on the Haskell tutorial I was working through. The reading of this line keeps bouncing back and forth. "OK, function f, passing in x, returning x, y… Wait? What's y? Is that from an external scope? Anyway, here's an if-else clause, and oh, there's the y! It's the same as g(x). OK, so where all did they use y? Hmm, let's see, looks like just the one spot…" This would be much easier to grasp as a function:
def f(x):
    y = g(x)
    if y > 0:
        return x, y
    else:
        return x, 0
This version makes the parallelism between the two return values much more clear: "Oh, OK, it's always going to return x as the first in the tuple, and the second value will be either g(x) or 0, whichever is greater." We might even rewrite it as
def f(x):
    y = g(x)
    r = y if y > 0 else 0
    return x, r
to make it shorter.
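Taking the "whichever is greater" description literally, the second element could also be spelled with the built-in max(). This is only a sketch: the body of `g` is a made-up placeholder, and `f_max` is a hypothetical name for the alternative spelling.

```python
def g(x):
    # Hypothetical stand-in for g; any numeric function would do.
    return 10 - x * x


def f(x):
    # The shorter rewrite from the message above.
    y = g(x)
    r = y if y > 0 else 0
    return x, r


def f_max(x):
    # Same result, spelled as "g(x) or 0, whichever is greater".
    return x, max(g(x), 0)


same = all(f(x) == f_max(x) for x in range(-5, 6))
print(same)
```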
 Carl
Carl Johnson wrote:
Andrey Popp wrote:
I am not talking about list comprehensions only; there are other cases for a where-clause, for example lambdas:
f = lambda x: (x, y) if x > 0 else (x, 0) where y = g(x)
Yuck. This reminds me of why I gave up on the Haskell tutorial I was working through. The reading of this line keeps bouncing back and forth. "OK, function f, passing in x, returning x, y… Wait? What's y? Is that from an external scope? Anyway, here's an if-else clause, and oh, there's the y! It's the same as g(x). OK, so where all did they use y? Hmm, let's see, looks like just the one spot…"
I know what you mean :) I think "where" is best used where you all but know exactly what the where-bound name refers to, but have to spell it out for the stupid computer somewhere...
Georg
Personally I like the idea of the "where" clause. It works well in Haskell since it is tied closely to how functions are often defined in mathematics. e.g.
Area of a circle = pi*r**2 where pi is 3.14159.... and r is the radius of the circle
In Haskell, it makes for concise function definitions. IIRC defining functions without the "where" clause in Haskell is a Hassle with a capital "H". However Python suffers from no such problem. Though I like the idea as a concept, I see it as syntactic sugar for Python that is essentially a solution in search of a problem.
Andrey Popp wrote:
I think that producing a list of tuples (which is conceptually the image of a mapping from the some_iterable set) is a basic operation. But... OK, now we have three ways to produce the list below:
[(f(x), f(x)) for x in some_iterable if f(x) < 2]
Taken in isolation, there is no reason to produce a list rather than an iterator.
def g(iterable):
    for x in iterable:
        y = f(x)
        if y < 2:
            yield (y, y)
This does more than the above because g is reusable both with the same iterable and other iterables.
[(y, y) for y in (f(x) for x in some_iterable) if y < 2]
Though I would probably write the reusable generator, I might write this as
ygen = (f(x) for x in some_iterable)  # or map(f, some_iterable) if f is an existing function
ypairs = ((y, y) for y in ygen if y < 2)
There are really two ideas: map f over some_iterable, and make pairs conditionally.
There should be no shame in putting each in a separate statement.
Nested generators do not really turn this into 'two' iterations, as iterating with ypairs will run the implied loop in synchrony.
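A small trace can illustrate the "in synchrony" point. This is a sketch only: the `events` list and the logging inside `f` are additions made for the demonstration, and the sample input is arbitrary.

```python
events = []  # hypothetical trace of what happens, in order


def f(x):
    # Stand-in for f that logs each call before returning its argument.
    events.append(f"f({x})")
    return x


some_iterable = [0, 1, 5]

ygen = (f(x) for x in some_iterable)
ypairs = ((y, y) for y in ygen if y < 2)

# Nothing has run yet; both generators are lazy. Consuming ypairs pulls one
# item at a time through ygen, so calls to f interleave with the results.
for pair in ypairs:
    events.append(f"got {pair}")

print(events)
```

The trace alternates between a call to f and the pair built from it, rather than showing all calls to f followed by all pairs.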
map(lambda obj: (obj, obj), filter(lambda y: y < 2, map(f, some_iterable)))
And none of them looks as obvious as
[(f(x), f(x)) for x in some_iterable if f(x) < 2]
, does it? While the proposed variant with a where-clause
[(y, y) for x in some_iterable if y < 2 where y = f(x)]
looks more natural than the three suggested variants.
'natural' is in the eye of the beholder
I strongly emphasize that the where-clause is only syntactic sugar, suggested for better readability.
Too much sugar = stomach ache ;).
Terry Jan Reedy
Terry Reedy tjreedy@udel.edu writes:
[(y, y) for y in (f(x) for x in some_iterable) if y < 2]
Though I would probably write the reusable generator, I might write this as
ygen = (f(x) for x in some_iterable)  # or map(f, some_iterable) if f is an existing function
ypairs = ((y, y) for y in ygen if y < 2)
There are really two ideas: map f over some_iterable, and make pairs conditionally.
There should be no shame in putting each in a separate statement.
Indeed. My only point with the above example is that, for those who *are* desperate to have this all as a single readable statement, the existing syntax supports it nicely.
Since the existing syntax already easily supports every case where the proposed syntax would be used, the burden is on those proposing the new syntax to demonstrate benefits significant enough to outweigh the costs of adding bulk to the language.
AFAICT, no convincingly significant benefit has been presented to add syntax for this case.
I strongly emphasize that the where-clause is only syntactic sugar, suggested for better readability.
Too much sugar = stomach ache ;).
+1 QOTW
Andrey Popp 8mayday@gmail.com writes:
Reading the discussion on python-ideas about "Accessing the result of comprehension's expression from the conditional", I've come to the idea of where clauses, similar to Haskell's.
This solves the problem of recalculating a value multiple times. For example, in the following expression:
[(f(x), f(x)) for x in some_iterable if f(x) < 2]
New syntax isn't necessary to solve the above stated problem. For example, the following existing syntax is also a solution::
[(y, y) for y in (f(x) for x in some_iterable) if y < 2]
For the proposed new syntax to be accepted, it would need to be somehow significantly superior to the existing syntax. Can you demonstrate how it's superior?
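As a quick check that the nested form really does avoid the recalculation, the two spellings can be compared side by side. The body of `f`, the `call_log` helper, and the sample input are assumptions for illustration.

```python
call_log = []  # hypothetical helper: records every call to f


def f(x):
    # Stand-in for an expensive function.
    call_log.append(x)
    return x * x


some_iterable = [0, 1, 2, 3]

# Existing syntax: bind y by iterating over an inner generator expression.
nested = [(y, y) for y in (f(x) for x in some_iterable) if y < 2]
calls_nested = len(call_log)

call_log.clear()

# The comprehension from the original post, for comparison.
original = [(f(x), f(x)) for x in some_iterable if f(x) < 2]
calls_original = len(call_log)

print(nested == original)            # both forms build the same list
print(calls_nested, calls_original)  # the nested form calls f once per input
```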
On Mon, Jun 22, 2009 at 3:01 PM, Ben Finney ben+python@benfinney.id.au wrote:
Andrey Popp 8mayday@gmail.com writes:
Reading the discussion on python-ideas about "Accessing the result of comprehension's expression from the conditional", I've come to the idea of where clauses, similar to Haskell's.
This solves the problem of recalculating a value multiple times. For example, in the following expression:
[(f(x), f(x)) for x in some_iterable if f(x) < 2]
New syntax isn't necessary to solve the above stated problem. For example, the following existing syntax is also a solution::
[(y, y) for y in (f(x) for x in some_iterable) if y < 2]
For the proposed new syntax to be accepted, it would need to be somehow significantly superior to the existing syntax. Can you demonstrate how it's superior?
Your statement:
[(y, y) for y in (f(x) for x in some_iterable) if y < 2]
means producing a list by iterating over the generator, which iterates over some_iterable. It is correct algorithmically, but not semantically. I did not claim that it is impossible to do this kind of thing in Python right now, without where-clauses. It is only syntactic sugar, but very expressive, I think.
Andrey Popp writes:
Your statement:
[(y, y) for y in (f(x) for x in some_iterable) if y < 2]
means producing a list by iterating over the generator, which iterates over some_iterable. It is correct algorithmically, but not semantically.
That's only true if you think of "iterable" as a generalized (ie, possibly infinite) sequence. However, you can also think of this, not as "iterating over a sequence of values created by iterating over another sequence of values," but rather "iterating 'application' over a sequence of filters." Then the "for" clause functions as a "where". The first "for" in Ben's expression is interpreted as "where y = f(x)", and the second "for" is interpreted as "where x = next(some_iterable)".
This is not fully general; it only works at all in a comprehension context, and I'm not entirely sure it works perfectly here. But it troubles me that we already have a way to say "where" at the upper levels of nesting, and you want to introduce a new "where" that is only useful at upper levels of nesting.
Another way to put this is that because these are iterables (streams) rather than sequences (ordered (finite) sets), it doesn't make sense to talk about "double iteration." There are multiple levels of iterable here, but in the end there's only one iterative process, and all the levels share the same "iteration".
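The stream view can be sketched with an unbounded source: if the inner level were a separate pass, the expression below could never produce anything. The body of `f` and the choice of itertools.count() as the source are assumptions for illustration, and a generator expression is used on the outside so that nothing tries to build an infinite list.

```python
from itertools import count, islice


def f(x):
    # Hypothetical stand-in for f; cycles through 0, 1, 2.
    return x % 3


# count() never ends, so there is no finite inner sequence to finish first.
pairs = ((y, y) for y in (f(x) for x in count()) if y < 2)

# Taking four results advances both levels together, one item at a time.
first_four = list(islice(pairs, 4))
print(first_four)
```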
participants (11)
Andrey Popp
Ben Finney
Carl Johnson
Georg Brandl
Gerald Britton
Jim Jewett
Mathias Panzenböck
Nick Coghlan
Stephen J. Turnbull
Steven D'Aprano
Terry Reedy