[Python-ideas] Fwd: grouping / dict of lists
Chris Barker
chris.barker at noaa.gov
Fri Jul 13 15:16:08 EDT 2018
On Fri, Jul 13, 2018 at 12:38 PM, Michael Selik <mike at selik.org> wrote:
> Thanks for linking to these.
>
yup -- real use cases are really helpful.
Though the other paradigm for grouping is use of setdefault() rather than
defaultdict. So it would be nice to look for those, too.
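For anyone searching for such cases, here is a minimal sketch of the two idioms side by side. The `pairs` data is made up purely for illustration:

```python
from collections import defaultdict

# Made-up sample data: (key, value) pairs to be grouped by key.
pairs = [("a", 1), ("b", 2), ("a", 3)]

# defaultdict idiom: a missing key is created on first access.
by_default = defaultdict(list)
for key, value in pairs:
    by_default[key].append(value)

# setdefault idiom: a plain dict, with the empty list supplied at each call.
by_setdefault = {}
for key, value in pairs:
    by_setdefault.setdefault(key, []).append(value)
```

Both produce the same mapping; the setdefault version simply ends up with a plain dict.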
> I looked at many of them in my own research, but for some reason didn't
> think to write down the links. I'll respond to each one separately.
>
> Throughout, I'm going to use my proposed ``grouped`` builtin to
> demonstrate possible revisions. Note that I am *not* suggesting a
> replacement to defaultdict. The proposal is to make a common task easier
> and more reliable. It does not satisfy all uses of defaultdict.
>
agreed -- and it shouldn't.
I'd like to see how some of these pan out with my proposed API:
either a Grouped class, or at least (key, value) iterables and/or a value
function.
I don't have time now to do them all, but for the moment:
>> I noticed recently that *all* examples for collections.defaultdict
>> (https://docs.python.org/3.7/library/collections.html#collections.defaultdict)
>> are cases of grouping (for an int, a list and a set) from an iterator
>> with a key, value output.
>>
>
and yet others on this thread think a (key, value) input would be rare -- I
guess it depends on whether you are thinking dict-like already....
>
>> https://frama.link/o3Hb3-4U,
>>
>
> accum = defaultdict(list)
> garbageitems = []
>
> for item in root:
>     filename = findfile(opts.fileroot, item.attrib['classname'])
>     accum[filename].append(float(item.attrib['time']))
>     if filename is None:
>         garbageitems.append(item)
>
>
> This might be more clear if separated into two parts.
>
> def keyfunc(item):
>     return findfile(opts.fileroot, item.attrib['classname'])
> groups = grouped(root, keyfunc)
> groups = {k: [float(v.attrib['time']) for v in g]
>           for k, g in groups.items()}
> garbage = groups.pop(None, [])
>
so this one is a prime case for a value function -- I think post-processing
the groups is a pretty common case -- why make people post-process it?
def keyfunc(item):
    return findfile(opts.fileroot, item.attrib['classname'])

def valuefunc(item):
    return float(item.attrib['time'])

groups = grouped(root, keyfunc, valuefunc)
garbage = groups.pop(None, [])
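To make the key function plus value function idea concrete, here is a minimal runnable sketch. `grouped` is still only a proposal, so the implementation below is an assumption about how it might behave, and the `items` and `known_files` data are made-up stand-ins for the XML elements and `findfile` in the original example:

```python
def grouped(iterable, key, value=None):
    """Hypothetical sketch of the proposed ``grouped``; not an existing builtin.

    Groups items into a dict of lists by ``key(item)``. If ``value`` is
    given, ``value(item)`` is stored instead of the item itself.
    """
    groups = {}
    for item in iterable:
        stored = item if value is None else value(item)
        groups.setdefault(key(item), []).append(stored)
    return groups


# Made-up stand-ins for the XML elements: dicts with 'classname' and 'time'.
items = [
    {"classname": "test_a", "time": "0.5"},
    {"classname": "test_b", "time": "1.25"},
    {"classname": "unknown", "time": "2.0"},
    {"classname": "test_a", "time": "0.25"},
]
# Made-up lookup standing in for findfile(); unknown class names map to None.
known_files = {"test_a": "a.py", "test_b": "b.py"}


def keyfunc(item):
    return known_files.get(item["classname"])


def valuefunc(item):
    return float(item["time"])


groups = grouped(items, keyfunc, valuefunc)
garbage = groups.pop(None, [])  # entries whose file could not be found
```

One thing this makes visible: `garbage` now holds the converted values (here `[2.0]`) rather than the original items, which differs from the `garbageitems` list in the defaultdict version.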
And the post-processing is then mixing comprehension style with key
function style (what to call that -- "functional" style?), so why not use a
(key, value) iterable:
groups = grouped((findfile(opts.fileroot, item.attrib['classname']),
                  item.attrib['time'])
                 for item in root)
OK -- that's packing a bit too much into a line, so how about:
def keyfunc(item):
    return findfile(opts.fileroot, item.attrib['classname'])

groups = grouped((keyfunc(item), item.attrib['time']) for item in root)
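A rough sketch of what a (key, value) iterable API could look like in practice. The name `grouped_pairs` and the `words` data are hypothetical, chosen only to keep the example self-contained:

```python
def grouped_pairs(pairs):
    """Hypothetical sketch: group an iterable of (key, value) pairs
    into a dict mapping each key to the list of its values."""
    groups = {}
    for key, value in pairs:
        groups.setdefault(key, []).append(value)
    return groups


# Made-up data: group word lengths by first letter.
words = ["apple", "avocado", "banana", "blueberry", "cherry"]
groups = grouped_pairs((w[0], len(w)) for w in words)
```

The key and the value are both computed inside the generator expression, so no separate key or value function is needed.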
>
> self.mapping = collections.defaultdict(set)
> for op in (op for op in graph.get_operations()):
>     if op.name.startswith(common.SKIPPED_PREFIXES):
>         continue
>     for op_input in op.inputs:
>         self.mapping[op_input].add(op)
>
>
> This is a case of a single element being added to multiple groups, which
> is your section B, below. The loop and filter could be better. It looks
> like someone intended to convert if/continue to a comprehension, but
> stopped partway through the revision.
>
yeah, this is weird --
But it does make a case for having a class with the option of using a set to
collect (which I have in an older commit of my prototype):
inputs = ((op_input, op) for op in ops for op_input in op.inputs)
groups = Grouping(inputs, key=itemgetter(0), collection=set)
otherwise, you could have a method to do it:
groups.map_on_groups(set)
(not sure I like that method name, but I hope you get the idea)
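To illustrate both options, here is a simplified sketch. It is not the actual prototype: this `Grouping` takes (key, value) pairs directly (no `key` argument), and the `collection` parameter and `map_on_groups` method are assumptions about how such a class might behave. The `ops` data is made up:

```python
class Grouping(dict):
    """Hypothetical sketch of a grouping class; not the real prototype.

    Built from (key, value) pairs. ``collection`` is the container type
    used for each group (list by default, or e.g. set).
    """

    def __init__(self, pairs=(), collection=list):
        super().__init__()
        collected = {}
        for key, value in pairs:
            collected.setdefault(key, []).append(value)
        for key, values in collected.items():
            self[key] = collection(values)

    def map_on_groups(self, func):
        """Replace each group with ``func(group)``, in place."""
        for key in self:
            self[key] = func(self[key])


# Made-up data: operation name -> names of its inputs.
ops = {"add": ["x", "y"], "mul": ["x", "x"]}


def input_pairs():
    return ((op_input, op) for op, op_inputs in ops.items()
            for op_input in op_inputs)


# Option 1: collect into sets up front.
as_sets = Grouping(input_pairs(), collection=set)

# Option 2: collect into lists, then convert each group afterwards.
converted = Grouping(input_pairs())
converted.map_on_groups(set)
```

Both routes end up with the same mapping; the difference is only whether the duplicates are dropped during collection or afterwards.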
OK, back to work.
-CHB
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov