[Python-Dev] Proposal: defaultdict
Ian Bicking
ianb at colorstudy.com
Fri Feb 17 18:58:30 CET 2006
Raymond Hettinger wrote:
>>>Over lunch with Alex Martelli, he proposed that a subclass of dict
>>>with this behavior (but implemented in C) would be a good addition to
>>>the language
>
>
> I would like to add something like this to the collections module, but a PEP is
> probably needed to deal with issues like:
>
> * implications of a __getitem__ succeeding while get(value, x) returns x
> (possibly different from the overall default)
> * implications of a __getitem__ succeeding while __contains__ would fail
> * whether to add this to the collections module (I would say yes)
> * whether to allow default functions as well as default values (so you could
> instantiate a new default list)
> * comparing all the existing recipes and third-party modules that have already
> done this
> * evaluating its fitness for common use cases (i.e. bags and dict of lists).
It doesn't seem that useful for bags, assuming we're talking about an
{object: count} implementation of bags; bags should really have a more
set-like interface than a dict-like interface.
A dict of lists typically means a multi-valued dict. In that case it
seems like x[key_not_found] should return the empty list, as that means
zero values; even though zero values also means that
x.has_key(key_not_found) should return False as well. *but* getting
x[key_not_found] does not (for a multi-valued dict) mean that suddently
has_key should return true. I find the side-effect nature of
__getitem__ as proposed in default_dict to be rather confusing, and when
reading code it will very much break my expectations. I assume that
attribute access and [] access will not have side effects. Coming at it
from that direction, I'm -1, though I'm +1 on dealing with the specific
use case that started this (x.setdefault(key, []).append(value)).
An implementation targetted specifically at multi-valued dictionaries
seems like it would be better. Incidentally, on Web-SIG we've discussed
wsgiref, and it includes a mutli-values, ordered, case-insensitive
dictionary. Such a dictionary(ish) object has clear applicability for
HTTP headers, but certainly it is something I've used many times
elsewhere. In a case-sensitive form it applies to URL variables.
Really there's several combinations of features, each with different uses.
So we have now...
dicts: unordered, key:value (associative), single-value
sets: unordered, not key:value, single-value
lists: ordered, not key:value, multi-value
We don't have...
bags: unordered, not key:value, multi-value
multi-dict: unordered, key:value, multi-value
ordered-dict: ordered, key:value, single-value
ordered-multi-dict: ordered, key:value, single-value
For all key:value collections, normalized keys can be useful. (Though
notably the wsgiref Headers object does not have normalized keys, but
instead does case-insensitive comparisons.) I don't know where
dict-of-dict best fits in here.
--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org
More information about the Python-Dev
mailing list