PyToolz, Pandas, Dask .groupby()

toolz.itertoolz.groupby does this succinctly without any new/magical/surprising syntax.

"""
def groupby(key, seq):
    """ Group a collection by a key function

    >>> names = ['Alice', 'Bob', 'Charlie', 'Dan', 'Edith', 'Frank']
    >>> groupby(len, names)  # doctest: +SKIP
    {3: ['Bob', 'Dan'], 5: ['Alice', 'Edith', 'Frank'], 7: ['Charlie']}

    >>> iseven = lambda x: x % 2 == 0
    >>> groupby(iseven, [1, 2, 3, 4, 5, 6, 7, 8])  # doctest: +SKIP
    {False: [1, 3, 5, 7], True: [2, 4, 6, 8]}

    Non-callable keys imply grouping on a member.

    >>> groupby('gender', [{'name': 'Alice', 'gender': 'F'},
    ...                    {'name': 'Bob', 'gender': 'M'},
    ...                    {'name': 'Charlie', 'gender': 'M'}])  # doctest:+SKIP
    {'F': [{'gender': 'F', 'name': 'Alice'}],
     'M': [{'gender': 'M', 'name': 'Bob'},
           {'gender': 'M', 'name': 'Charlie'}]}

    See Also:
        countby
    """
    if not callable(key):
        key = getter(key)
    d = collections.defaultdict(lambda: [].append)
    for item in seq:
        d[key(item)](item)
    rv = {}
    for k, v in iteritems(d):
        rv[k] = v.__self__
    return rv
"""

If you're willing to install Pandas (and NumPy, and ...), there's pandas.DataFrame.groupby:

Dask has a different groupby implementation:

_______________________________________________
On Thursday, June 28, 2018, Chris Barker via Python-ideas <python-ideas@python.org> wrote:

On Thu, Jun 28, 2018 at 8:25 AM, Nicolas Rolin <nicolas.rolin@tiime.fr> wrote:

I use list and dict comprehensions a lot, and a problem I often have is doing the equivalent of a group_by operation (to use SQL terminology).

I don't know from SQL, so "group by" doesn't mean anything to me, but this:

For example, if I have a list of tuples (student, school) and I want the list of students by school, the only option I'm left with is to write

    from collections import defaultdict

    student_by_school = defaultdict(list)
    for student, school in student_school_list:
        student_by_school[school].append(student)

It seems to me that the issue here is that there is no way to have a "defaultdict comprehension".

I can't think of a syntactically clean way to make that possible, though.

Could itertools.groupby help here? It seems to work, but boy, it's ugly:

In [45]: student_school_list
Out[45]:
[('Fred', 'SchoolA'),
('Bob', 'SchoolB'),
('Mary', 'SchoolA'),
('Jane', 'SchoolB'),
('Nancy', 'SchoolC')]
In [46]: {a:[t[0] for t in b] for a,b in groupby(sorted(student_school_list, key=lambda t: t[1]), key=lambda t: t[1])}
Out[46]: {'SchoolA': ['Fred', 'Mary'], 'SchoolB': ['Bob', 'Jane'], 'SchoolC': ['Nancy']}
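
For comparison, here is a rough sketch of both approaches side by side, using only the standard library. The function names group_with_defaultdict and group_with_itertools are made up for this illustration and are not part of any library. The sketch assumes the input is a list of (student, school) tuples like the one above. The main thing it tries to show is that itertools.groupby only merges adjacent items, which is why the sort has to come first.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

student_school_list = [
    ("Fred", "SchoolA"),
    ("Bob", "SchoolB"),
    ("Mary", "SchoolA"),
    ("Jane", "SchoolB"),
    ("Nancy", "SchoolC"),
]


# Hypothetical helper name, for illustration only.
def group_with_defaultdict(pairs):
    """Group (value, key) pairs with a plain loop; input order does not matter."""
    groups = defaultdict(list)
    for value, key in pairs:
        groups[key].append(value)
    return dict(groups)


# Hypothetical helper name, for illustration only.
def group_with_itertools(pairs):
    """Same grouping via itertools.groupby.

    groupby only merges runs of adjacent items with equal keys,
    so the pairs are sorted by key before grouping.
    """
    by_key = itemgetter(1)
    ordered = sorted(pairs, key=by_key)  # stable sort keeps students in input order
    return {
        key: [value for value, _ in run]
        for key, run in groupby(ordered, key=by_key)
    }


students_by_school = group_with_itertools(student_school_list)
print(students_by_school)
```

Naming the key function once with itemgetter(1) removes the duplicated lambda, which accounts for most of the ugliness in the one-liner. The defaultdict loop avoids the sort entirely, so it makes a single pass over the data.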
-CHB
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov
Python-ideas mailing list
Python-ideas@python.org