[Python-Dev] Re: "groupby" iterator
Guido van Rossum
guido at python.org
Fri Dec 5 10:22:42 EST 2003
> Greg Ewing's proposal of a "given" keyword (x.score given x) got me
> thinking. I figured I would play around a bit and try to come up with
> the most readable version of the original "groupby" idea (for which I
> could imagine *some* implementation):
>
>     for group in sequence.groups(using item.score - item.penalty):
>         ...do stuff with group
>
> Having written this down, it seems to me the most readable so far. The
> keyword "using" creates a new scope, within which "item" is bound to the
> arg (or *args?) passed in. I don't know about you all, but the thing I
> like least about lambda is having to mention 'x' twice:
>
> lambda x: x.score
>
> Why have the programmer bind a custom name to an object we're going to
> then use 'anonymously' anyway? I understand its historical necessity,
> but it's always struck me as more complex than the concept being
> implemented. Ideally, we should be able to reference the passed-in
> objects without having to invent names for them.
Huh? How can you reference something without having a name for it?
Are you proposing to add pronouns to Python?
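
For comparison, here is roughly what the grouping example looks like when the name is spelled out. This is only a sketch: the Item record and its sample values are hypothetical (the thread mentions only .score and .penalty), and it uses itertools.groupby rather than the sequence.groups() method imagined in the quoted text.

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical record type; the thread only mentions .score and .penalty.
Item = namedtuple("Item", ["name", "score", "penalty"])


def net(item):
    # "item" is the explicit name that the proposed "using" syntax
    # would leave implicit.
    return item.score - item.penalty


sequence = [Item("a", 5, 1), Item("b", 7, 3), Item("c", 9, 2)]

# groupby() only merges adjacent items with equal keys, so sort by the
# same key first.
groups = {
    key: [it.name for it in grp]
    for key, grp in groupby(sorted(sequence, key=net), key=net)
}
```

The key function still has to call its argument something; the name is just moved into a def.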
> Now, consider multi-arg lambdas such as:
>
> sequence.sort(lambda x, y: cmp(x[0], y[0]))
>
> In these cases, we wish to apply the same operation to each item (that
> is, we calculate x[0] and y[0]). If we bind "item" to each argument *in
> turn*, we save a lot of syntax. The above might then be written as:
> sequence.sort(using cmp(item[0])) # Hard to implement.
>
> or:
> sequence.sort(cmp(using item[0])) # Easier but ugly. Meh.
>
> or:
> sequence.sort(cmp using item[0]) # Oooh. Nice. :)
>
> or:
> # might we assume cmp(), since sort does...?
> sequence.sort(using item[0])
>
> I like #3, since cmp is explicit but doesn't use cmp(), which looks too
> much like a call. Given (cmp using item[0]), the "using block" would
> look at the arguments supplied by sort(), call __getitem__(0) on each,
> and pass those values in order into cmp, returning the result.
There are lots of situations where the comparison lambda is just a bit
more complex than this, for example:
lambda x, y: cmp(x[0], y[0]) or cmp(x[1], y[1])
And how would you spell lambda x, y: x+y? "+ using item"??? That
becomes a syntactical nightmare. (Or what about lambda x, y: 2*(x+y)?)
I also think you are cheating by using sort() as the example -- other
examples of multi-argument lambdas aren't necessarily so uniform in
the arguments.
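
To make the objection concrete, here is a small sketch of those lambdas written as ordinary functions. Assumptions: the cmp() helper below is a stand-in for the Python 2 builtin (Python 3 no longer has one), functools.cmp_to_key adapts the comparison for sorted(), and the sample data is made up.

```python
from functools import cmp_to_key


def cmp(a, b):
    # Stand-in for the Python 2 builtin cmp(): negative, zero or positive.
    return (a > b) - (a < b)


def by_first_then_second(x, y):
    # cmp() returns 0 on a tie, so "or" falls through to the second field.
    # Each argument is indexed twice, with different indices, which a
    # single "item" placeholder cannot express.
    return cmp(x[0], y[0]) or cmp(x[1], y[1])


def add(x, y):
    # The two arguments are combined, not transformed independently.
    return x + y


def twice_sum(x, y):
    return 2 * (x + y)


pairs = [(2, "b"), (1, "z"), (2, "a"), (1, "c")]
ordered = sorted(pairs, key=cmp_to_key(by_first_then_second))
```

None of these fits the pattern "apply the same expression to each argument, then hand the results to one function".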
> The "item" keyword functions similarly to Guido's Voodoo.foo() proposal,
> now that I think about it. There's no reason it couldn't grow some early
> binding, either, as suggested, although multiple operations would become
> unwieldy. How would you early-bind this?
>
> sequence.groups(using divmod(item, 4)[1])
>
> ...except perhaps by using multiply-nested scopes to bind the "1" and
> then the "4"?
I see all sorts of problems with this, but early-binding "1" and "4"
isn't among them -- early binding only applies to free variables,
not to constants.
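
A short illustration of the distinction, using the default-argument idiom as one common way to get early binding today (this is an assumption for the sake of the example, not the mechanism discussed in the thread). The variable n is hypothetical; it stands in for the constant 4 so that there is a free variable to bind.

```python
# Late binding: "n" is a free variable of each lambda, looked up when the
# lambda is called, by which time the loop has finished and n is 4.
late = [lambda item: divmod(item, n)[1] for n in (3, 4)]

# Early binding via a default argument: the current value of n is
# captured when each lambda is defined.
early = [lambda item, n=n: divmod(item, n)[1] for n in (3, 4)]


def const(item):
    # The constants 4 and 1 are part of the code itself; there is nothing
    # to bind, early or late.
    return divmod(item, 4)[1]


late_results = [f(10) for f in late]
early_results = [f(10) for f in early]
```

Both late lambdas end up dividing by 4, while the early ones keep 3 and 4 respectively.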
> Hmm. It would have to do some fancy dancing to get everything in the
> right order. Too much like reinventing Python to think about at the
> moment. :) The point is, passing the "item" instance through such a
> scheme should be the easy part.
I've read this whole post twice, and I still don't understand what
you're really proposing (or how it could ever work given the rest of
Python), so I think it's probably not a good idea from a readability
perspective...
--Guido van Rossum (home page: http://www.python.org/~guido/)