[Python-Dev] Counter proposal: multidict

Ian Bicking ianb at colorstudy.com
Fri Feb 17 21:51:26 CET 2006

Guido van Rossum wrote:
> On 2/17/06, Ian Bicking <ianb at colorstudy.com> wrote:
>>I really don't like that defaultdict (or a dict extension) means that
>>x[not_found] will have noticeable side effects.  This all seems to be a
>>roundabout way to address one important use case of a dictionary with
>>multiple values for each key, and in the process breaking an important
>>quality of good Python code, that attribute and getitem access not have
>>noticeable side effects.
>>So, here's a proposed interface for a new multidict object, borrowing
>>some methods from Set but mostly from dict.  Some things that seemed
>>particularly questionable to me are marked with ??.
> Have you seen my revised proposal (which is indeed an addition to the
> standard dict rather than a subclass)?

Yes, and though it is more general it has the same issue of side
effects.  Doesn't it seem strange that getting an item will change the
values of .keys(), .items(), and .has_key()?
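
To make the side effect concrete, here is a small sketch.  It uses
collections.defaultdict (the class that later shipped in Python 2.5) as a
stand-in for the behavior under discussion; the revised proposal itself may
differ in detail, so treat this as an illustration only.

```python
from collections import defaultdict

d = defaultdict(list)

before = list(d.keys())   # no keys yet
_ = d["not_found"]        # a plain lookup, no assignment anywhere
after = list(d.keys())    # the lookup itself inserted the key

print(before, after)
print("not_found" in d)   # membership changed as a result of reading
```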

> Your multidict addresses only one use case for the proposed behavior;
> what's so special about dicts of lists that they should have special
> support? What about dicts of dicts, dicts of sets, dicts of
> user-defined objects?

What's so special?  95% (probably more!) of current use of .setdefault()
is .setdefault(key, []).append(value).
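
For reference, the idiom in question looks roughly like this (the sample
data is made up for illustration):

```python
pairs = [("a", 1), ("b", 2), ("a", 3)]  # illustrative data only

groups = {}
for key, value in pairs:
    # Create the list the first time key is seen, then append to it.
    groups.setdefault(key, []).append(value)

print(groups)
```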

Also, since when do features have to address all possible cases? 
Certainly there are other cases, and I think they can be answered with 
other classes.  Here are some current options:

.setdefault() -- works with any subtype; slightly less efficient than 
what you propose.  Awkward to read; doesn't communicate intent very well.

UserDict -- works for a few cases where you want to make dict-like 
objects.  Messes up the concept of identity and containment -- resulting 
objects both "are" dictionaries, and "contain" a dictionary (obj.data).

DictMixin -- does anything you can possibly want, requiring only the 
overriding of a couple methods.
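
A rough sketch of the mixin approach follows.  It assumes a modern Python,
where collections.abc.MutableMapping plays the role DictMixin played in the
UserDict module; LowerKeyDict is a hypothetical class invented for this
example.  Only the five core methods are written by hand; get(), update(),
__contains__ and the rest come from the mixin.

```python
from collections.abc import MutableMapping


class LowerKeyDict(MutableMapping):
    """Hypothetical example: a mapping that lowercases its string keys."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key.lower()]

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __delitem__(self, key):
        del self._data[key.lower()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


headers = LowerKeyDict()
headers["Content-Type"] = "text/plain"
headers.update({"X-Foo": "1"})   # update() is inherited, routes via __setitem__
```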

dict subclassing -- does anything you want as well, but you typically 
have to override many more methods than with DictMixin (and if you don't 
have to override every method, that's not documented in any way).  dict isn't
written with subclassing in mind.  Really, you are proposing that one
specific kind of override be made feasible, either with subclassing or 
injecting a method.

That said, I'm not saying that several kinds of behavior shouldn't be 
supported.  I just don't see why dict should support them all (or 
multidict).  And I also think dict will support them poorly.

multidict implements one behavior *well*.  In a documented way, with a 
name people can refer to.  I can say "multidict", I can't say "a dict 
where I set default_factory to list" (well, I can say that, but that 
just opens up yet more questions and clarifications).

Some ways multidict differs from default_factory=list:

* __contains__ works (you have to use .get() with default_factory to get 
a meaningful result)
* Barring cases where there are exceptions, x[key] and x.get(key) return 
the same value for multidict; with default_factory one returns [] and 
the other returns None when the key isn't found.  But if you do x[key]; 
x.get(key) then x.get(key) always returns [].
* You can't use __setitem__ to put non-list items into a multidict; with
multidict you don't have to guard against non-sequence values.
* [] is meaningful not just as the default value, but as a null value; 
the multidict implementation respects both aspects.
* Specific method x.add(key, value) that indicates intent in a way that 
x[key].append(value) does not.
* items and iteritems return values meaningful to the context (a list of
(key, single_value) pairs -- this is usually what I want, and avoids a nested
for loop).  __len__ is also usefully different from dict's.
* .update() handles iteritems sensibly, and updates from dictionaries 
sensibly -- if you mix a default_factory=list dict with a "normal" 
(single-value) dictionary you'll get an effectively corrupted dictionary 
(where some keys are lists)
* x.getfirst(key) is useful
* I think this will be much easier to reason about in situations with
threads -- dict acts very predictably with threads, and people rely upon
that.
* multidict can be written either with subclassing intended, or with an 
abstract superclass, so that other kinds of specializations of this 
superset of the dict interface can be made more easily (if DictMixin 
itself isn't already sufficient)
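
Several of the points above can be sketched in a few lines.  This is a
minimal, hypothetical MultiDict written for this note, not the interface
from the original proposal: returning [] for a missing key without
inserting it, and having __len__ count values rather than keys, are
assumptions.  collections.defaultdict again stands in for
default_factory=list.

```python
from collections import defaultdict


class MultiDict:
    """Hypothetical sketch of a multi-valued mapping (assumptions noted)."""

    def __init__(self):
        self._data = {}

    def add(self, key, value):
        # Explicit "add one value under key"; the only way values get in.
        self._data.setdefault(key, []).append(value)

    def __getitem__(self, key):
        # Assumption: a missing key reads as [], and reading never inserts.
        return list(self._data.get(key, ()))

    def get(self, key):
        # Same result as x[key], found or not.
        return self[key]

    def getfirst(self, key, default=None):
        values = self._data.get(key)
        return values[0] if values else default

    def __contains__(self, key):
        return bool(self._data.get(key))

    def items(self):
        # Flat (key, single_value) pairs, so callers need no nested loop.
        return [(k, v) for k, vs in self._data.items() for v in vs]

    def __len__(self):
        # Assumption: length counts values, not keys.
        return sum(len(vs) for vs in self._data.values())


m = MultiDict()
m.add("a", 1)
m.add("a", 2)
m.add("b", 3)
missing_value = m["missing"]          # read only; nothing is stored

# Contrast: with a default factory, get() changes after a lookup.
d = defaultdict(list)
first_get = d.get("k")                # key absent, so get() returns None
_ = d["k"]                            # lookup inserts an empty list
second_get = d.get("k")               # now get() returns that list
```

With this sketch, "missing" in m stays false after the read, and x[key]
and x.get(key) agree whether or not the key is present.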

So, I'm saying: multidict handles one very common collection need that 
dict handles awkwardly now.  multidict is a meaningful and useful class 
with its own identity/name and meaning separate from dict, and has 
methods that represent both the intersection and the difference between 
the two classes.  multidict does not in any way preclude other 
collection objects for other situations; it is entirely unfair to expect 
a new class to solve all issues.  multidict suggests an interface that 
other related classes can use (e.g., an ordered version).  multidict, 
unlike default_factory, is not just a recipe for creating a specific and 
commonly needed object, it is a class for creating it.

Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org
