
Here's yet another implementation for itertoolsmodule.c. (see attachment) I wrote it after the shower (really!) :)
Wow! Thanks. Let's all remember to take our showers and maybe Python will become the cleanest programming language. :)
Raymond, what do you think?
Yes. I recommend taking showers on a regular basis ;-) I'll experiment with groupby() for a few more days and see how it feels. The first impression is that it meets all the criteria for becoming an itertool (iters in, iters out; no unexpected memory use; works well with other tools; not readily constructed from existing tools). At first, the tool seems more special purpose than general purpose. OTOH, it is an excellent solution to a specific class of problems and it makes code much cleaner by avoiding the repeated code block in the non-iterator version.
I would make one change: after looking at another use case, I'd like to change the outer iterator to produce (key, grouper) tuples. This way, you can write things like
    totals = {}
    for key, group in sequence:
        totals[key] = sum(group)
This is a much stronger formulation than the original. It is clear, succinct, expressive, and less error-prone. The implementation would be more complex than the original. If the group is ignored, the outer iterator needs to be smart enough to read through the input iterator until the next group is encountered:
    names = ['Tim D', 'Jack D', 'Jack J', 'Barry W', 'Tim P']
    firstname = lambda n: n.split()[0]
    names.sort()
    unique_first_names = [first for first, _ in groupby(firstname, names)]
    ['Barry', 'Jack', 'Tim']
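For comparison, here is a minimal sketch of the same idea against the groupby() that ships in today's itertools. Note that it takes the iterable first and the key function as the second (or key=) argument, which differs from the argument order in the proposal above. The name first_name and the counts dictionary are only illustrative.

```python
from itertools import groupby


def first_name(full_name):
    # Key function: everything before the first space.
    return full_name.split()[0]


# groupby() only merges adjacent items, so the input is sorted first.
names = sorted(['Tim D', 'Jack D', 'Jack J', 'Barry W', 'Tim P'])

# Each group is ignored here; the outer iterator skips over the
# unconsumed items on its own before yielding the next key.
unique_first_names = [first for first, _ in groupby(names, key=first_name)]

# Each group is consumed here, in the style of the totals example.
counts = {first: sum(1 for _ in group)
          for first, group in groupby(names, key=first_name)}

print(unique_first_names)
print(counts)
```

The first comprehension is the "group is ignored" case: nothing reads from the inner iterator, yet each key still appears exactly once.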
In experimenting with groupby(), I am starting to see a need for a high speed data extractor function. This need is common to several tools that take function arguments (like list.sort(key=)). While extractor functions can be arbitrarily complex, many only fetch a specific attribute or element number. Alex's high-speed curry suggests that it is possible to create a function maker for fast lookups:

    students.sort(key=extract('grade'))    # key=lambda r: r.grade
    students.sort(key=extract(2))          # key=lambda r: r[2]

Raymond
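A rough sketch of what such a function maker could look like, assuming it dispatches on the argument type (a string means attribute lookup, anything else means item lookup). That dispatch rule is an assumption, not something specified above. The sketch leans on operator.attrgetter and operator.itemgetter from the standard library, and the Student record is made up for the example.

```python
from collections import namedtuple
from operator import attrgetter, itemgetter


def extract(field):
    """Hypothetical function maker: build a key function for one field.

    Assumption: a str names an attribute; anything else is an index or key.
    """
    if isinstance(field, str):
        return attrgetter(field)   # behaves like lambda r: r.<field>
    return itemgetter(field)       # behaves like lambda r: r[field]


# Illustrative record type; a namedtuple supports both access styles.
Student = namedtuple('Student', ['name', 'age', 'grade'])
students = [
    Student('Ann', 20, 'B'),
    Student('Bob', 19, 'A'),
    Student('Cy', 21, 'C'),
]

by_attribute = sorted(students, key=extract('grade'))
by_index = sorted(students, key=extract(2))

print([s.name for s in by_attribute])
print([s.name for s in by_index])
```

Both calls sort on the same field, so the two results should match. The getters are implemented in C, which is the kind of speedup over a lambda that the function maker is after.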