On 07/18/2009 01:30 AM, Chris Rebert wrote:
On Fri, Jul 17, 2009 at 11:18 PM, Chris Rebert <pyideas@rebertia.com> wrote:
Truth be told, it's more than just defaultdict(int). It adds .elements() and .most_common().
Seems bag-like to me.
- Unordered? Check.
- Allows duplicates? Check.
- O(1) containment test? Check.
- Counts multiplicity of elements? Check.
- Iterable? Check.
The only non-bag thing about it is allowing 0 and negative multiplicities, which I agree is unintuitive; I don't like that "feature" either.
Actually, from the docs, it also appears (I don't have 3.0 handy to test) to get len() wrong, using the dict definition of "number of unique items" rather than just "number of items" as would be more appropriate for a bag.
In the event a Bag is not added, +1 for adding a method to Counter to return `sum(count if count > 0 else 0 for count in a_counter.values())`.
Cheers, Chris
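The len() discrepancy described above can be illustrated with a short sketch. The helper name `bag_size` is hypothetical (it is not a Counter method); it simply wraps the expression proposed in the message, written with an equivalent filter clause.

```python
from collections import Counter


def bag_size(counter):
    """Total number of items, counting multiplicity.

    Hypothetical helper: zero and negative counts contribute nothing.
    """
    return sum(count for count in counter.values() if count > 0)


c = Counter("abracadabra")  # a:5, b:2, r:2, c:1, d:1
c["z"] = 0                  # a zero count still keeps its key
c["y"] = -2                 # so does a negative count

print(len(c))       # dict-style length: number of distinct keys
print(bag_size(c))  # bag-style length: sum of the positive counts
```

Here len(c) counts the seven keys, including the two with non-positive counts, while bag_size(c) counts the eleven letters actually in the bag.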
As well as getting len() wrong, it gets iteration wrong. It iterates over elements with counts of 0 and -1, and it iterates only once over elements that appear multiple times. Yes, you can iterate over .elements(), but this should be the default, not a special case.

As for adding most_common, it just calls heapq.nlargest(n, self.items(), key=_itemgetter(1)), which anyone can do, and which my bag class does.

My bag class behaves like a collection and provides a .unique_elements() method that returns the underlying set. You can .add(elem) and .delete(elem) just like you can with a set, or you can manually change their multiplicities like in Counter with bag[elem] = 5 or bag[elem] -= 2.

If Counter is supposed to be a collection of elements, this makes no sense:
>>> c = Counter()
>>> c['a'] += 1
>>> c['a'] -= 1
>>> 'a' in c
True
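The iteration and most_common points above can be checked with a small sketch. The key names are made up for illustration, and the public operator.itemgetter stands in for the private _itemgetter alias quoted in the message.

```python
from collections import Counter
import heapq
from operator import itemgetter

c = Counter(a=3, b=1)
c["zero"] = 0
c["neg"] = -1

# Iterating the Counter yields each key once, including keys whose
# count is zero or negative.
keys = sorted(c)

# elements() repeats each key by its count and skips counts below one.
elems = sorted(c.elements())

# The nlargest call quoted above, usable on any mapping of counts.
top = heapq.nlargest(1, c.items(), key=itemgetter(1))

print(keys)   # ['a', 'b', 'neg', 'zero']
print(elems)  # ['a', 'a', 'a', 'b']
print(top)    # [('a', 3)]
print(top == c.most_common(1))  # True
```

Plain iteration therefore reports four "elements" where a bag would hold four items of only two distinct kinds.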
-Michael Lenzen