[Python-ideas] Proposal for extending the collections module - bags / multisets, ordered sets / unique lists

Michael Lenzen m.lenzen at gmail.com
Sat Jul 18 17:11:47 CEST 2009


On 07/18/2009 01:30 AM, Chris Rebert wrote:
> On Fri, Jul 17, 2009 at 11:18 PM, Chris Rebert<pyideas at rebertia.com>  wrote:
>> Truth be told, it's more than just defaultdict(int). It adds
>> .elements() and .most_common().
>>
>> Seems bag-like to me.
>> - Unordered? Check.
>> - Allows duplicates? Check.
>> - O(1) containment test? Check.
>> - Counts multiplicity of elements? Check.
>> - Iterable? Check.
>>
>> The only non-bag thing about it is allowing 0 and negative
>> multiplicities, which I agree is unintuitive; I don't like that
>> "feature" either.
>
> Actually, from the docs, it also appears (I don't have 3.0 handy to
> test) to get len() wrong, using the dict definition of "number of
> unique items" rather than just "number of items" as would be more
> appropriate for a bag.
>
> In the event a Bag is not added, +1 for adding a method to Counter to
> return `sum(count if count > 0 else 0 for count in
> a_counter.values())`
>
> Cheers,
> Chris


Besides getting len() wrong, it also gets iteration wrong.  Iterating
over a Counter visits elements with counts of 0 or -1, and visits an
element that appears multiple times only once.  Yes, you can iterate
over .elements(), but that should be the default, not a special case.
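
To make the len() and iteration complaints concrete, here is a small sketch against collections.Counter as it behaves in current Python 3 releases. The sample data and variable names are made up for illustration; the bag_size expression is a shorter equivalent of the sum Chris suggested above.

```python
from collections import Counter

c = Counter("aab")   # two 'a', one 'b'
c["z"] = 0           # an explicit zero count stays in the mapping
c["y"] = -1          # negative counts are allowed too

# len() follows dict semantics: the number of distinct keys.
unique_keys = len(c)

# What a bag would report: the sum of the positive multiplicities.
bag_size = sum(n for n in c.values() if n > 0)

# Plain iteration yields each key once, including 'z' and 'y'.
keys_seen = sorted(c)

# elements() repeats each key by its count and skips counts below one.
items_seen = sorted(c.elements())
```

Here unique_keys is 4 while bag_size is 3, and only items_seen matches what one would expect from iterating over a multiset.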

As for most_common(), it just calls
heapq.nlargest(n, self.items(), key=_itemgetter(1))
which anyone can do, and which my bag class does as well.
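
As a rough illustration of that point, the same ranking can be obtained from outside the class using only heapq and operator. The sample string and the name top_two are assumptions made for the example.

```python
import heapq
from collections import Counter
from operator import itemgetter

c = Counter("abracadabra")   # 'a' occurs 5 times; 'b' and 'r' twice each

# Rank (element, count) pairs by count and keep the two largest.
top_two = heapq.nlargest(2, c.items(), key=itemgetter(1))
```

The first pair in top_two is ('a', 5); the second is one of the elements with a count of 2.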

My bag class behaves like a collection and provides a .unique_elements()
method that returns the underlying set.  You can .add(elem) and
.delete(elem) just like you can with a set, or you can set an element's
multiplicity directly, as in Counter, with bag[elem] = 5 or bag[elem] -= 2.
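
A minimal sketch of the interface described above might look like the following. This is a hypothetical reconstruction and not the actual bag class: the dict-based storage, the KeyError and ValueError behaviour, and the dropping of elements whose count reaches zero are all assumptions.

```python
class Bag:
    """Hypothetical multiset sketch; not the class discussed in this thread."""

    def __init__(self, iterable=()):
        self._counts = {}            # element -> positive multiplicity
        for elem in iterable:
            self.add(elem)

    def add(self, elem):
        self._counts[elem] = self._counts.get(elem, 0) + 1

    def delete(self, elem):
        # Assumption: removing an absent element is an error.
        if elem not in self._counts:
            raise KeyError(elem)
        self[elem] = self._counts[elem] - 1

    def __getitem__(self, elem):
        return self._counts.get(elem, 0)

    def __setitem__(self, elem, count):
        # Assumption: negative multiplicities are rejected, zero removes.
        if count < 0:
            raise ValueError("multiplicity cannot be negative")
        if count == 0:
            self._counts.pop(elem, None)
        else:
            self._counts[elem] = count

    def __contains__(self, elem):
        return elem in self._counts

    def __len__(self):
        return sum(self._counts.values())   # total items, not unique ones

    def __iter__(self):
        for elem, count in self._counts.items():
            for _ in range(count):
                yield elem

    def unique_elements(self):
        return set(self._counts)


bag = Bag("aab")
size_before = len(bag)      # counts every occurrence
bag["a"] = 5
bag["a"] -= 2               # goes through __getitem__ then __setitem__
bag.add("c")
bag.delete("c")             # count drops to zero, so 'c' disappears
```

With this sketch, size_before is 3, bag["a"] ends at 3, and 'c' in bag is False after the delete.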

If Counter is supposed to be a collection of elements, this makes no sense:
 >>> from collections import Counter
 >>> c = Counter()
 >>> c['a'] += 1
 >>> c['a'] -= 1
 >>> 'a' in c
True
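
For completeness, the Counter documentation notes that setting a count to zero does not remove the element and that del must be used for that. A short sketch of the difference, with variable names chosen only for illustration:

```python
from collections import Counter

c = Counter()
c["a"] += 1
c["a"] -= 1
present_at_zero = "a" in c    # the key survives with a count of 0

del c["a"]                    # explicit removal is required
present_after_del = "a" in c

# Looking up a missing key returns 0 and does not insert it.
missing_count = c["a"]
still_absent = "a" not in c
```

So the membership test only reflects the multiplicity if the caller remembers to delete zero-count keys by hand.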

-Michael Lenzen


