[Python-ideas] OrderedCounter and OrderedDefaultDict

Wes Turner wes.turner at gmail.com
Fri Nov 13 23:46:44 EST 2015


On Fri, Oct 16, 2015 at 9:08 PM, Andrew Barnert via Python-ideas <
python-ideas at python.org> wrote:

> Actually, forget all that; it's even simpler.
>
> At least in recent 3.x, the only thing wrong with inheriting from both
> types, assuming you put OrderedDict first, is the __init__ signature. So:
>
>     class OrderedDefaultDict(OrderedDict, defaultdict):
>         def __init__(self, default_factory=None, *a, **kw):
>             OrderedDict.__init__(self, *a, **kw)
>             self.default_factory = default_factory
>
> More importantly, because __missing__ support is built into dict, despite
> the confusing docs for defaultdict, you don't really need defaultdict at
> all here:
>
>     class OrderedDefaultDict(OrderedDict):
>         def __init__(self, default_factory=None, *a, **kw):
>             OrderedDict.__init__(self, *a, **kw)
>             self.default_factory = default_factory
>         def __missing__(self, key):
>             self[key] = value = default_factory()
>             return value
>
> And either of these should work with 2.5+ (according to
> https://docs.python.org/2/library/stdtypes.html#dict that's when
> dict.__missing__ was added).
>
>
Thanks!

- [ ] Could/should maybe either of these make it into the standard library,
that would save a fair amount of copying.

.. Great in combination w/ dict views:
https://docs.python.org/2/library/stdtypes.html#dictionary-view-objects





> Sent from my iPhone
>
> On Oct 16, 2015, at 18:57, Andrew Barnert <abarnert at yahoo.com> wrote:
>
> Sorry, sent too soon…
>
> Sent from my iPhone
>
> On Oct 16, 2015, at 18:39, Andrew Barnert via Python-ideas <
> python-ideas at python.org> wrote:
>
> On Oct 16, 2015, at 03:36, Ian Foote <ian at feete.org> wrote:
>
> I recently wanted to use an OrderedCounter and an OrderedDefaultDict.
> After a bit of googling I discovered that OrderedCounter is quite easy to
> implement:
>
>
> This is already in the docs for OrderedDict, so it shouldn't have taken
> any googling.
>
> from collections import Counter, OrderedDict
>
> class OrderedCounter(Counter, OrderedDict):
>      'Counter that remembers the order elements are first seen'
>      def __repr__(self):
>          return '%s(%r)' % (self.__class__.__name__,
>                             OrderedDict(self))
>      def __reduce__(self):
>          return self.__class__, (OrderedDict(self),)
>
>
> from https://rhettinger.wordpress.com/2011/05/26/super-considered-super/
>
> Unfortunately an OrderedDefaultDict did not seem so easy. I did find
> http://stackoverflow.com/questions/6190331/can-i-do-an-ordered-default-dict-in-python
> which suggests:
>
>
> Most of the trickiness here is handling None as a factory, which is only
> useful for subclasses that implement a custom __missing__
>
>
> … or that requires two-stage initialization, where it can initialize the
> base with None and then assign the factory later. Still not all that
> common, but it is a separate reason for the extra complexity besides a
> custom __missing__ that doesn't need the factory.
>
> To make something that works like you'd expect except for that part is a
> lot simpler.
>
>
> Off the top of my head (untested, since I'm on my phone):
>
> class DefaultOrderedDict(OrderedDict):
>     def __init__(self, default_factory, *a, **kw):
>         super().__init__(*a, **kw)
>         self.default_factory = default_factory
>     def __getitem__(self, key):
>         try:
>             return super().__getitem__(key)
>         except KeyError:
>             return self.__missing__(key)
>     def __missing__(self, key):
>         self[key] = value = self.default_factory()
>         return value
>     def __repr__(self):
>         return '{}({}, {})'.format(type(self).__name__,
> self.default_factory, dict.__repr__(self))
>
> In fact, if you aren't worried about subclassing, or about being able to
> directly call or pass around a bound__missing__, you can make this even
> simpler.
>
> I may be wrong about this being picklable/copyable without any extra help;
> if so, it still doesn't need the whole complicated "treat a None factory as
> if it doesn't exist" part.
>
> Also, I think the SO recipe works in Python 2.4+, or whichever first added
> OrderedDict, while this requires 3.1 (although it should be obvious how to
> backport it, make sure that OrderedDict isn't an old-style class…).
>
> Anyway, other than those issues, it seems pretty obvious, and I don't
> think there's any part of that functionality that's easy to get wrong.
> Handling a None factory in __missing__ and pickle/copy is easy to get
> wrong, but if you aren't going to subclass this and therefore aren't going
> to try to build that part, you're not going to build it wrong.
>
> And I'm not sure a docs recipe that's mostly code that's unnecessary for
> most uses, hiding the important and simple part, would be a great idea.
>
> from collections import OrderedDict, Callable
> class DefaultOrderedDict(OrderedDict):
>     # Source: http://stackoverflow.com/a/6190500/562769
>     def __init__(self, default_factory=None, *a, **kw):
>         if (default_factory is not None and
>            not isinstance(default_factory, Callable)):
>             raise TypeError('first argument must be callable')
>         OrderedDict.__init__(self, *a, **kw)
>         self.default_factory = default_factory
>
>     def __getitem__(self, key):
>         try:
>             return OrderedDict.__getitem__(self, key)
>         except KeyError:
>             return self.__missing__(key)
>
>     def __missing__(self, key):
>         if self.default_factory is None:
>             raise KeyError(key)
>         self[key] = value = self.default_factory()
>         return value
>
>     def __reduce__(self):
>         if self.default_factory is None:
>             args = tuple()
>         else:
>             args = self.default_factory,
>         return type(self), args, None, None, self.items()
>
>     def copy(self):
>         return self.__copy__()
>
>     def __copy__(self):
>         return type(self)(self.default_factory, self)
>
>     def __deepcopy__(self, memo):
>         import copy
>         return type(self)(self.default_factory,
>                           copy.deepcopy(self.items()))
>
>     def __repr__(self):
>         return 'OrderedDefaultDict(%s, %s)' % (self.default_factory,
>                                                OrderedDict.__repr__(self))
>
>
> This seems to me both easy to get wrong and hard to discover. It also
> risks getting out of sync with updates to collections.
> I'm wondering if this is sufficient justification to add OrderedCounter
> and OrderedDict to collections, either directly or as recipes.
>
>
> Since one of them is already a recipe and the other one would probably not
> be useful as one, I don't think that's a great idea.
>
> Adding OrderedDefaultDict to the module itself might be useful. Or,
> alternatively, put it in PyPI or ActiveState and add a link to it from the
> docs? (I'm pretty sure there are already implementations in both places,
> actually. Also, it would be nice to be able to point at a third-party
> implementation that works in 2.6+/3.2+ or whatever even if it only appears
> in the 3.6 docs.)
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20151113/8b8a9993/attachment.html>


More information about the Python-ideas mailing list