Interesting talk on Python vs. Ruby and how he would like Python to have just a bit more syntactic flexibility.

Thu Feb 18 23:27:06 EST 2010

On Feb 18, 10:58 pm, Paul Rubin <no.em... at nospam.invalid> wrote:
> Steve Howell <showel... at yahoo.com> writes:
> >> But frankly, although there's no reason that you _have_ to name the
> >> content at each step, I find it a lot more readable if you do:
>
> >> def print_numbers():
> >>     tuples = [(n*n, n*n*n) for n in (1,2,3,4,5,6)]
> >>     filtered = [ cube for (square, cube) in tuples if square!=25 and
> >>                  cube!=64 ]
> >>     for f in filtered:
> >>         print f
>
> > The names you give to the intermediate results here are
> > terse--"tuples" and "filtered"--so your code reads nicely.
>
> But that example makes tuples and filtered into completely expanded
> lists in memory.  I don't know Ruby so I've been wondering whether the
> Ruby code would run as an iterator pipeline that uses constant memory.

I don't know how Ruby works, either.  If it's using constant memory,
switching the Python to generator expressions (and getting constant
memory usage) is simply a matter of turning square brackets into
parentheses:

def print_numbers():
    tuples = ((n*n, n*n*n) for n in (1,2,3,4,5,6))
    filtered = ( cube for (square, cube) in tuples if square!=25 and
                 cube!=64 )
    for f in filtered:
        print f

Replace (1,2,3,4,5,6) with xrange(100000000) and memory usage still
stays constant.
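
As a rough way to see the laziness for yourself, here is a minimal sketch.
It is written for Python 3 (where range is already lazy, like xrange in
Python 2), and the helper name cubes() is made up for this example.  It
takes only the first five results from a pipeline over a huge range, so
the rest of the range is never touched:

```python
from itertools import islice

def cubes(numbers):
    """Lazily yield n*n*n for each n, skipping square == 25 and cube == 64."""
    # Both stages are generator expressions, so nothing is computed
    # until a consumer asks for the next value.
    pairs = ((n * n, n * n * n) for n in numbers)
    return (cube for (square, cube) in pairs if square != 25 and cube != 64)

# islice stops pulling from the pipeline after five values.
first_five = list(islice(cubes(range(1, 100000000)), 5))
print(first_five)
```

The only list that ever exists here is the five-element result.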

Though for this particular example, I prefer a strict looping solution
akin to what Jonathan Gardner had upthread:

for n in (1,2,3,4,5,6):
    square = n*n
    cube = n*n*n
    if square == 25 or cube == 64: continue
    print cube

> > In a more real world example, the intermediate results would be
> > something like this:
>
> >    departments
> >    departments_in_new_york
> >    departments_in_new_york_not_on_bonus_cycle
> >    employees_in_departments_in_new_york_not_on_bonus_cycle
> >    names_of_employee_in_departments_in_new_york_not_on_bonus_cycle

I don't think the assertion that the names would be ridiculously long
is accurate, either.

Something like:

departments = blah
ny_depts = blah(departments)
non_bonus_depts = blah(ny_depts)
non_bonus_employees = blah(non_bonus_depts)
employee_names = blah(non_bonus_employees)

If the code is at all well-structured, it'll be just as obvious from
the context that each list/generator/whatever is building from the
previous one as it is in the anonymous block case.
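
To make that less hand-wavy, here is one way the blah() calls might be
filled in.  This is only a sketch in Python 3: the Department and Employee
records, their fields, and the sample data are all invented for
illustration, not taken from any real code in this thread.

```python
from collections import namedtuple

# Hypothetical record types, invented for this sketch.
Department = namedtuple("Department", "name city on_bonus_cycle employees")
Employee = namedtuple("Employee", "name")

# Made-up sample data.
departments = [
    Department("Sales", "New York", False, [Employee("Ann"), Employee("Bob")]),
    Department("Legal", "New York", True, [Employee("Cy")]),
    Department("Ops", "Boston", False, [Employee("Di")]),
]

# Each step is a lazy generator built from the previous one.
ny_depts = (d for d in departments if d.city == "New York")
non_bonus_depts = (d for d in ny_depts if not d.on_bonus_cycle)
non_bonus_employees = (e for d in non_bonus_depts for e in d.employees)

# Only the final step materializes a list.
employee_names = [e.name for e in non_bonus_employees]
print(employee_names)
```

The names stay short because each one only has to say what changed since
the previous step; the rest is carried by context.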