
I did a quick review of the stdlib, including the tests, to see which list comprehensions could be replaced with generator expressions. Though I admit I am biased towards early binding, I ended up looking for cases with the following properties:
* doesn't use the result immediately
* contains free variables
This is less than 10% of the cases. However, for each of them, I had to check whether the free variables had a chance of being modified after the genexpr. This is pretty rare, admittedly, but there is an example in test/test_random.py (test_avg_std). There are a number of other examples that just happen not to modify the free variables, by chance; e.g. in optparse.py:
    metavar = option.metavar or option.dest.upper()
    short_opts = [sopt + metavar for sopt in option._short_opts]
    long_opts = [lopt + "=" + metavar for lopt in option._long_opts]
If we find out later that long_opts actually needs a slightly different value for metavar, it would be tempting to do:
    metavar = option.metavar or option.dest.upper()
    short_opts = [sopt + metavar for sopt in option._short_opts]
    metavar = option.metavar or option.dest.capitalize()
    long_opts = [lopt + "=" + metavar for lopt in option._long_opts]
Replace these with genexprs and it doesn't work any more: the second metavar is unexpectedly used in the first genexpr as well. In general I find it strange to have to check whether a given variable could be modified much later in the same function to be sure of what a genexpr really means. Early binding is closer to the idea that turning a listcomp into a genexpr should just work if you only iterate once on the result.
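To make the pitfall concrete, here is a minimal sketch. It assumes late binding of free variables inside a genexpr, which is what current CPython does. The function names and the flag strings are made up for illustration and are not taken from optparse.py; the last function shows one possible way to emulate early binding by passing the value as a function argument.

```python
def with_listcomp(short_flags, long_flags):
    metavar = "FILE"
    # The list is built right here, so this metavar is the one used.
    short_opts = [s + " " + metavar for s in short_flags]
    metavar = "File"
    long_opts = [l + "=" + metavar for l in long_flags]
    return short_opts + long_opts


def with_genexpr(short_flags, long_flags):
    metavar = "FILE"
    # Nothing is computed yet; metavar is looked up when the genexpr runs.
    short_opts = (s + " " + metavar for s in short_flags)
    metavar = "File"  # rebinding happens before short_opts is consumed
    long_opts = (l + "=" + metavar for l in long_flags)
    return list(short_opts) + list(long_opts)


def _short_opts(flags, metavar):
    # Hypothetical helper: metavar is a parameter of this function, so a
    # later rebinding in the caller cannot affect the genexpr.
    return (s + " " + metavar for s in flags)


def with_genexpr_early(short_flags, long_flags):
    metavar = "FILE"
    short_opts = _short_opts(short_flags, metavar)
    metavar = "File"
    long_opts = (l + "=" + metavar for l in long_flags)
    return list(short_opts) + list(long_opts)


if __name__ == "__main__":
    print(with_listcomp(["-f"], ["--file"]))
    print(with_genexpr(["-f"], ["--file"]))
    print(with_genexpr_early(["-f"], ["--file"]))
```

Under that assumption, only with_genexpr produces a short option that carries the second metavar; the listcomp version and the helper version both keep the first one.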
Thanks for the analysis. My comment on this: why on earth would you want to replace the perfectly sensible list comprehension with a generator comprehension in this particular case? (Note that the remainder of that function proceeds to do list concatenation of the two lists.)

A more general question would be: have you found any examples of list comprehensions that weren't immediately used where there would be a compelling reason to use a generator expression instead? In this example, I can't see any of the touted advantages -- the set of command line options will never be long enough to have to worry about the memory consumption. How about the other examples you found?

--Guido van Rossum (home page: http://www.python.org/~guido/)