My question is: does it have to be a key function? Can't we just remove the "key" argument?

Because for pretty much all the given examples, I find my proposed default as readable as, and nearly as short as, the "key" syntax:

> grouping(words, key=len)

grouping((len(word), word) for word in words)

> grouping(names, key=itemgetter(0))

grouping((name[0], name) for name in names)

> grouping(contacts, key=itemgetter('city'))

grouping((contact['city'], contact) for contact in contacts)

> grouping(employees, key=itemgetter('department'))

grouping((employee['department'], employee) for employee in employees)

> grouping(os.listdir('.'), key=lambda filepath: os.path.splitext(filepath)[1])

grouping((os.path.splitext(filepath)[1], filepath) for filepath in os.listdir('.'))

> grouping(transactions, key=lambda v: 'debit' if v > 0 else 'credit')

grouping(('debit' if v > 0 else 'credit', v) for v in transactions)

The code is slightly more verbose, but it is akin to filter(function, iterable) vs (i for i in iterable if function(i)).
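
To make the comparison concrete, here is a minimal sketch of the two call styles side by side. Both helper names (grouping_by_key, grouping_by_pairs) are hypothetical, and returning a plain dict of lists is an assumption made for illustration; it is not the API proposed in the thread.

```python
from collections import defaultdict


def grouping_by_key(iterable, key=None):
    """Hypothetical key-function variant: group each item under key(item)."""
    groups = defaultdict(list)
    for item in iterable:
        # With no key function, the item itself is used as the group key.
        groups[item if key is None else key(item)].append(item)
    return dict(groups)


def grouping_by_pairs(pairs):
    """Hypothetical pair variant: group values taken from (key, value) tuples."""
    groups = defaultdict(list)
    for group_key, value in pairs:
        groups[group_key].append(value)
    return dict(groups)


words = ["apple", "fig", "kiwi", "pear", "plum"]
by_key = grouping_by_key(words, key=len)
by_pairs = grouping_by_pairs((len(word), word) for word in words)
```

Under these assumptions both calls build the same mapping from word length to words; only the place where the key is computed differs.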

Nicolas Rolin

2018-07-02 11:52 GMT+02:00 Michael Selik <mike@selik.org>:

On Mon, Jul 2, 2018 at 2:32 AM Nicolas Rolin <nicolas.rolin@tiime.fr> wrote:
I think the current default is quite weird, as it pretty much amounts to a count() of each key (which can be useful, but is not really what I expect from a grouping). I would prefer a default that might raise an error over a default that says OK and outputs something that is not what I might want.
For example, the default could be that grouping unpacks (key, value) tuples from the iterator and does what's expected with them (groups values by key). That is quite reasonable: you have one example with (key, value) pairs in your examples, and no example using the current default. It also allows syntax of the kind

> grouping((food_type, food_name) for food_type, food_name in foods)

which is pretty nice to have.
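
The remark that the identity default amounts to a count can be illustrated with a small sketch. The helper grouping_identity is hypothetical and assumes the default key is the item itself, with a plain dict of lists as the result.

```python
from collections import Counter, defaultdict


def grouping_identity(iterable):
    """Hypothetical default behaviour: each item is its own group key."""
    groups = defaultdict(list)
    for item in iterable:
        groups[item].append(item)
    return dict(groups)


votes = ["yes", "no", "yes", "yes"]
groups = grouping_identity(votes)

# Each group only holds repeated copies of its key, so the sole new
# information is the group size, which collections.Counter already gives.
counts = {key: len(members) for key, members in groups.items()}
same_as_counter = counts == Counter(votes)
```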

I'm of two minds on this point. First, I agree that it'd be nice to handle the (key, value) pair case more elegantly. It comes to mind often when writing examples, even if proportionally less in practice.

Second, I'll paraphrase "Jakob's Law of the Internet User Experience" -- users spend most of their time using *other* functions. Because itertools.groupby and other functions in Python established a standard for the behavior of key-functions, I want to keep that standard.
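
As a reminder of the convention being referred to, the sketch below uses only standard-library functions that accept a key function; the sample data is made up for illustration.

```python
from itertools import groupby

words = ["fig", "kiwi", "pear", "apple"]

# sorted(), max() and itertools.groupby() all take the same kind of
# key function: a callable applied to each element.
longest = max(words, key=len)
by_length = sorted(words, key=len)

# groupby() only merges consecutive elements, so the input is sorted first.
runs = [(length, list(group)) for length, group in groupby(by_length, key=len)]

# With no key, groupby() falls back to the element itself (identity).
letter_runs = [(letter, len(list(group))) for letter, group in groupby("aabbb")]
```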

Third, some classes might have a rich equality method that allows many interesting values to all wind up in the same group even if using the default "identity" key-function.
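
One way to picture this third point is the sketch below. The CaseInsensitive class and the grouping_identity helper are both hypothetical; the sketch also assumes a dict-based implementation, which needs __hash__ to agree with __eq__.

```python
from collections import defaultdict


class CaseInsensitive:
    """Hypothetical wrapper whose equality ignores letter case."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        if not isinstance(other, CaseInsensitive):
            return NotImplemented
        return self.text.lower() == other.text.lower()

    def __hash__(self):
        # Must be consistent with __eq__ for use as a dict key.
        return hash(self.text.lower())


def grouping_identity(iterable):
    """Hypothetical identity-key grouping backed by a dict of lists."""
    groups = defaultdict(list)
    for item in iterable:
        groups[item].append(item)
    return dict(groups)


cities = [CaseInsensitive(t) for t in ["Paris", "PARIS", "Lyon", "paris"]]
groups = grouping_identity(cities)
member_texts = [[m.text for m in members] for members in groups.values()]
```

Here the distinct spellings of "Paris" land in one group even though no key function was passed, so the groups carry more than a count.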

Thanks for the suggestion. I'll include it in the PEP, at least for documenting all reasonable options.
