[Python-ideas] OrderedCounter and OrderedDefaultDict

Andrew Barnert abarnert at yahoo.com
Sat Oct 17 03:39:02 CEST 2015


On Oct 16, 2015, at 03:36, Ian Foote <ian at feete.org> wrote:
> 
> I recently wanted to use an OrderedCounter and an OrderedDefaultDict. After a bit of googling I discovered that OrderedCounter is quite easy to implement:

This is already in the docs for OrderedDict, so it shouldn't have taken any googling.
> from collections import Counter, OrderedDict
> 
> class OrderedCounter(Counter, OrderedDict):
>      'Counter that remembers the order elements are first seen'
>      def __repr__(self):
>          return '%s(%r)' % (self.__class__.__name__,
>                             OrderedDict(self))
>      def __reduce__(self):
>          return self.__class__, (OrderedDict(self),)
> 
> from https://rhettinger.wordpress.com/2011/05/26/super-considered-super/
> 
> Unfortunately an OrderedDefaultDict did not seem so easy. I did find http://stackoverflow.com/questions/6190331/can-i-do-an-ordered-default-dict-in-python which suggests:

Most of the trickiness here is handling None as a factory, which is only useful for subclasses that implement a custom __missing__. To make something that works like you'd expect except for that part is a lot simpler. And I'm not sure a docs recipe that's mostly code that's unnecessary for most uses, hiding the important and simple part, would be a great idea.
> from collections import OrderedDict, Callable
> 
> class DefaultOrderedDict(OrderedDict):
>     # Source: http://stackoverflow.com/a/6190500/562769
>     def __init__(self, default_factory=None, *a, **kw):
>         if (default_factory is not None and
>            not isinstance(default_factory, Callable)):
>             raise TypeError('first argument must be callable')
>         OrderedDict.__init__(self, *a, **kw)
>         self.default_factory = default_factory
> 
>     def __getitem__(self, key):
>         try:
>             return OrderedDict.__getitem__(self, key)
>         except KeyError:
>             return self.__missing__(key)
> 
>     def __missing__(self, key):
>         if self.default_factory is None:
>             raise KeyError(key)
>         self[key] = value = self.default_factory()
>         return value
> 
>     def __reduce__(self):
>         if self.default_factory is None:
>             args = tuple()
>         else:
>             args = self.default_factory,
>         return type(self), args, None, None, self.items()
> 
>     def copy(self):
>         return self.__copy__()
> 
>     def __copy__(self):
>         return type(self)(self.default_factory, self)
> 
>     def __deepcopy__(self, memo):
>         import copy
>         return type(self)(self.default_factory,
>                           copy.deepcopy(self.items()))
> 
>     def __repr__(self):
>         return 'OrderedDefaultDict(%s, %s)' % (self.default_factory,
>                                                OrderedDict.__repr__(self))
> 
> This seems to me both easy to get wrong and hard to discover. It also risks getting out of sync with updates to collections.
> I'm wondering if this is sufficient justification to add OrderedCounter and OrderedDict to collections, either directly or as recipes.

Since one of them is already a recipe and the other one would probably not be useful as one, I don't think that's a great idea.

Adding OrderedDefaultDict to the module itself might be useful. Or, alternatively, put it in PyPI or ActiveState and add a link to it from the docs? (I'm pretty sure there are already implementations in both places, actually. Also, it would be nice to be able to point at a third-party implementation that works in 2.6+/3.2+ or whatever even if it only appears in the 3.6 docs.)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20151016/088f0314/attachment-0001.html>


More information about the Python-ideas mailing list