
I admit a hypothetical itertools.grouping that returned incrementally built dictionaries doesn't fill any simple need I have often encountered. I can be hand-wavy about "stateful bucketing of streams" and looking at windowing/tails, but I don't have a clean and simple example where I need this. The "run to exhaustion" interface has more obvious uses (albeit, they *must* be technically a subset of the incremental ones).
I think I will also concede that an incrementally built and yielded dictionary isn't *really* in the spirit of itertools either. I suppose tee() can grow unboundedly if only one tine is utilized... but in general, itertools is meant to provide iterators that keep memory usage limited to a few elements at a time (yes, groupby, takewhile, or dropwhile have pathological cases that could be unbounded... but usually they're not).
So maybe we really do need a dicttools or mappingtools module, with this as the first function to put inside it.
... but I STILL like a new collections.Grouping (or collections.Grouper) the best. It might overcome Guido's reluctance... and what goes there is really delegated by him, not his own baby.
On Tue, Jul 3, 2018 at 12:19 PM Chris Barker via Python-ideas <python-ideas@python.org> wrote:
On Tue, Jul 3, 2018 at 8:24 AM, Steven D'Aprano steve@pearwood.info wrote:
On Tue, Jul 03, 2018 at 09:23:07AM -0400, David Mertz wrote:
My problem with the second idea is that *I* find it very wrong to have something in itertools that does not return an iterator. It wrecks the combinatorial algebra of the module.
hmm -- that seems to be a pretty pedantic approach -- practicality beats purity, after all :-)
I think we should first decide if a grouping() function is a useful addition to the standard library (after all: "not every two line function needs to be in the stdlib"), and if so, then we can find a home for it.
personally, I'm wondering if a "dicttools" or something module would make sense -- I imagine there are all sorts of other handy utilities for working with dicts that could go there. (though, yeah, we'd want to actually have a handful of these before creating a new module :-) )
That said, it's easy to fix... and I believe independently useful. Just make grouping() a generator function rather than a plain function. This lets us get an incremental grouping of an iterable.
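For concreteness, here is a minimal sketch of what such a generator might look like. The name grouping() comes from the thread, but the signature (iterable, key) and the choice to yield the same dict object after every item are assumptions made for illustration, not a settled design.

```python
from collections import defaultdict

def grouping(iterable, key):
    """Hypothetical incremental grouping; the signature is an assumption."""
    groups = defaultdict(list)
    for item in iterable:
        groups[key(item)].append(item)
        # Yield the same dict (mutated in place) after every item.
        yield groups

words = ["apple", "bob", "avocado", "banana", "cherry"]
first_letter = lambda w: w[0]

# Incremental use: inspect the state after the first three items.
steps = grouping(words, first_letter)
for _ in range(3):
    partial = next(steps)
# Copy the lists, because the yielded dict keeps changing as we iterate.
snapshot = {k: list(v) for k, v in partial.items()}

# Run-to-exhaustion use: keep only the last dict yielded.
final = None
for final in grouping(words, first_letter):
    pass
```

Because the same dict is yielded each time, a caller who wants to keep an intermediate state has to copy it, which is one of the awkward points of the incremental interface.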
We already have something which lazily groups an iterable, returning groups as they are seen: groupby.
What makes grouping() different from groupby() is that it accumulates ALL of the subgroups rather than just consecutive subgroupings.
well, yeah, but it won't actually get you those until you exhaust the iterator -- so while it's different than itertools.groupby, is it different than itertools.groupby(sorted(iterable))?
In short, this wouldn't really solve the problems that itertools.groupby has for this sort of task -- so what's the point?
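To make the comparison concrete, here is a small sketch using only itertools.groupby and a plain dict. The sample data and variable names are made up for illustration.

```python
from itertools import groupby

words = ["apple", "bob", "avocado", "banana", "cherry"]
first_letter = lambda w: w[0]

# groupby only merges *consecutive* items with equal keys,
# so the keys "a" and "b" each show up twice here.
consecutive = [(k, list(g)) for k, g in groupby(words, key=first_letter)]

# Sorting first (which consumes the whole input) gives one group per key.
by_sorted = {
    k: list(g)
    for k, g in groupby(sorted(words, key=first_letter), key=first_letter)
}

# Plain dict accumulation, i.e. what a grouping run to exhaustion would give.
accumulated = {}
for w in words:
    accumulated.setdefault(first_letter(w), []).append(w)
```

Once the input has been fully consumed, by_sorted and accumulated hold the same groups; the accumulating version just avoids the sort and does not require the keys to be orderable.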
As for where it belongs, perhaps the collections module is the least worst fit.
That depends some on whether we go with a simple function, in which case collections is a pretty bad fit (but maybe still the least worst).
Personally I still like the idea of having this be a special type of dict, rather than "just a function" -- and then it's really obvious where to put it :-)
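A rough sketch of what I mean by a special type of dict. The class name Grouping is from this thread; the constructor arguments and the add() and update_from() methods are purely hypothetical, just to show the shape of the idea.

```python
class Grouping(dict):
    """Hypothetical dict subclass mapping key(item) -> list of items."""

    def __init__(self, iterable=(), key=None):
        super().__init__()
        # With no key function, items are grouped by their own value.
        self.key = key if key is not None else (lambda x: x)
        self.update_from(iterable)

    def add(self, item):
        """Append one item to the list for its group."""
        self.setdefault(self.key(item), []).append(item)

    def update_from(self, iterable):
        """Add every item from an iterable."""
        for item in iterable:
            self.add(item)

words = ["apple", "bob", "avocado", "banana", "cherry"]
g = Grouping(words, key=lambda w: w[0])
g.add("blueberry")  # the grouping stays usable after construction
```

Since it is a real dict, everything that works on a dict (lookup, iteration, len, and so on) works on it, and items can keep being added after it is built.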
-CHB
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115        (206) 526-6317   main reception
Chris.Barker@noaa.gov