I admit a hypothetical itertools.grouping that returned incrementally built dictionaries doesn't fill any simple need I have often encountered.  I can be hand-wavy about "stateful bucketing of streams" and looking at windowing/tails, but I don't have a clean and simple example where I need this.  The "run to exhaustion" interface has more obvious uses (albeit, they *must* be technically a subset of the incremental ones).
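
To make the two interfaces concrete, here is a minimal sketch, not a proposed implementation. The names grouping_iter and grouping, and the key parameter, are assumptions made for illustration; nothing like this exists in itertools today.

```python
from collections import defaultdict


def grouping_iter(iterable, key=None):
    """Hypothetical incremental version: yield the growing dict after each item."""
    groups = defaultdict(list)
    for item in iterable:
        k = item if key is None else key(item)
        groups[k].append(item)
        # The same mutable dict is yielded every time; a caller who
        # wants a frozen snapshot has to copy it.
        yield groups


def grouping(iterable, key=None):
    """Hypothetical run-to-exhaustion version, built on the incremental one."""
    groups = {}
    for groups in grouping_iter(iterable, key):
        pass
    return dict(groups)
```

The second function shows the "subset" relationship: the run-to-exhaustion result is just the last dictionary the incremental version yields.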

I think I will also concede that an incrementally built and yielded dictionary isn't *really* in the spirit of itertools either.  I suppose tee() can grow unboundedly if only one tine is utilized... but in general, itertools is meant to provide iterators that keep memory usage limited to a few elements at a time (yes, groupby, takewhile, or dropwhile have pathological cases that could be unbounded... but usually they're not).

So maybe we really do need a dicttools or mappingtools module, with this as the first function to put inside it.

... but I STILL like a new collections.Grouping (or collections.Grouper) the best.  It might overcome Guido's reluctance... and what goes there is really delegated by him, not his own baby.

On Tue, Jul 3, 2018 at 12:19 PM Chris Barker via Python-ideas <python-ideas@python.org> wrote:
On Tue, Jul 3, 2018 at 8:24 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Jul 03, 2018 at 09:23:07AM -0400, David Mertz wrote:
> My problem with the second idea is that *I* find it very wrong to have
> something in itertools that does not return an iterator.  It wrecks the
> combinatorial algebra of the module.

hmm -- that seems to be a pretty pedantic approach -- practicality beats purity, after all :-)

I think we should first decide if a grouping() function is a useful addition to the standard library (after all:  "not every two line function needs to be in the stdlib"), and if so, then we can find a home for it.

personally, I'm wondering if a "dicttools" or something module would make sense -- I imagine there are all sorts of other handy utilities for working with dicts that could go there. (though, yeah, we'd want to actually have a handful of these before creating a new module :-) )

> That said, it's easy to fix... and I believe independently useful.  Just
> make grouping() a generator function rather than a plain function.  This
> lets us get an incremental grouping of an iterable.

We already have something which lazily groups an iterable, returning
groups as they are seen: groupby.

What makes grouping() different from groupby() is that it accumulates
ALL of the subgroups rather than just consecutive subgroupings.

well, yeah, but it won't actually get you those until you exhaust the iterator -- so while it's different than itertools.groupby, is it different than itertools.groupby(sorted(iterable))?

In short, this wouldn't really solve the problems that itertools.groupby has for this sort of task -- so what's the point?
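
A small comparison may help here. This is only a sketch: the grouping() helper below is a hypothetical stand-in for the function under discussion, and the sample data is made up.

```python
from itertools import groupby


def first_letter(word):
    return word[0]


def grouping(iterable, key):
    """Hypothetical stand-in for the proposed function."""
    groups = {}
    for item in iterable:
        groups.setdefault(key(item), []).append(item)
    return groups


words = ["bob", "alice", "carol", "amy", "ben"]

# groupby alone only merges *consecutive* items with equal keys,
# so "a" and "b" each show up twice here.
consecutive = [(k, list(g)) for k, g in groupby(words, key=first_letter)]

# Sorting by the same key first makes equal keys adjacent.
via_sort = {
    k: list(g)
    for k, g in groupby(sorted(words, key=first_letter), key=first_letter)
}

# One pass, no sort, and the keys need not be orderable.
via_grouping = grouping(words, key=first_letter)
```

With this data, via_sort and via_grouping hold the same groups. The remaining differences are that the sorted version needs orderable keys, costs a sort, and orders its keys by sort order rather than by first appearance.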

> As for where it belongs, perhaps the collections module is the least worst fit.

That depends some on whether we go with a simple function, in which case collections is a pretty bad fit (but maybe still the least worst).

Personally I still like the idea of having this be a special type of dict, rather than "just a function" -- and then it's really obvious where to put it :-)
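
As a rough illustration of the "special type of dict" idea, here is one possible shape. The class name Grouping comes from the thread, but the constructor signature and the add() and extend() methods are assumptions for this sketch, not an agreed design.

```python
class Grouping(dict):
    """Hypothetical dict subclass that maps each key to a list of items."""

    def __init__(self, iterable=(), key=None):
        super().__init__()
        self._key = key
        self.extend(iterable)

    def add(self, item):
        # Compute the group key, then append to that group's list,
        # creating the list on first use.
        k = item if self._key is None else self._key(item)
        self.setdefault(k, []).append(item)

    def extend(self, iterable):
        for item in iterable:
            self.add(item)


g = Grouping(["apple", "avocado", "banana"], key=lambda s: s[0])
g.add("blueberry")
```

Because it is still a dict, the result works with everything that accepts a mapping, and it can keep accepting items after construction, which a plain function returning a dict cannot offer.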



Christopher Barker, Ph.D.

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

