Why defaultdict?

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Fri Jul 2 00:11:49 EDT 2010


I would like to better understand some of the design choices made in 
collections.defaultdict.

Firstly, to initialise a defaultdict, you do this:

from collections import defaultdict
d = defaultdict(callable, *args)

which sets an attribute of d, "default_factory", that is called on d[key] 
lookups when the key is missing. If callable is None, a defaultdict 
behaves like a built-in dict on lookups (a missing key raises KeyError), 
so I wonder why the API wasn't added to dict itself rather than to a 
separate class that needs to be imported. That is:

d = dict(*args)
d.default_factory = callable

If you failed to explicitly set the dict's default_factory, it would 
behave precisely as dicts do now. So why create a new class that needs to 
be imported, rather than just add the functionality to dict?
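
For reference, here is a small sketch of how the existing class behaves 
when the factory is None, and of assigning default_factory after 
construction. The variable names are only for illustration:

```python
from collections import defaultdict

# With default_factory=None, a missing key raises KeyError, as with dict.
d = defaultdict(None, {"a": 1})
try:
    d["missing"]
except KeyError:
    missing_raises = True
else:
    missing_raises = False

# default_factory is a writable attribute, so it can be set after creation.
d.default_factory = list
d["b"].append(2)

# d.get() does not call the factory; only d[key] lookups do.
got = d.get("c")
```

So the attribute-assignment style already works on defaultdict today; the 
question is only whether it could have lived on dict itself.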

Is it just an aesthetic choice to support passing the factory function as 
the first argument? I would think that the advantage of having it built-
in would far outweigh the cost of an explicit attribute assignment.



Secondly, why is the factory function not called with the key? There are 
three obvious kinds of "default values" a dict might want, in order from 
most to least general:

(1) The default value depends on the key in some way: return factory(key)
(2) The default value doesn't depend on the key: return factory()
(3) The default value is a constant: return C

defaultdict supports (2) and (3):

defaultdict(factory, *args)
defaultdict(lambda: C, *args)

but it doesn't support (1). If the key were passed to the factory 
function, it would be easy to support all three use-cases, at the cost of 
a slightly more complex factory function for cases (2) and (3). E.g. the 
current idiom:

defaultdict(factory, *args)

would become:

defaultdict(lambda key: factory(), *args)
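
To make the current behaviour concrete, here is a short sketch of cases 
(2) and (3) with the existing defaultdict, and of what happens if the 
factory expects the key. The names groups, counts and keyed are 
illustrative only:

```python
from collections import defaultdict

# Case (2): the factory is called with no arguments on each miss.
groups = defaultdict(list)
groups["x"].append(1)

# Case (3): a constant default, wrapped in a zero-argument lambda.
counts = defaultdict(lambda: 0)
counts["y"] += 1

# Case (1) is not available: a factory that expects the key fails,
# because defaultdict calls default_factory() with no arguments.
keyed = defaultdict(lambda key: key.upper())
try:
    keyed["z"]
except TypeError:
    key_not_passed = True
else:
    key_not_passed = False
```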


(There is a zeroth case as well, where the default value depends on the 
key and what else is in the dict: factory(d, key). But I suspect that's 
well and truly YAGNI territory.)
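
For comparison, a key-aware variant can be sketched as a dict subclass 
using the __missing__ hook, which dict's d[key] lookup calls when the key 
is absent. KeyedDefaultDict is a hypothetical name, not anything in the 
standard library, and this is only a minimal sketch of the idea:

```python
class KeyedDefaultDict(dict):
    """Hypothetical dict subclass whose factory receives the missing key."""

    def __init__(self, default_factory=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # Called by dict's d[key] lookup when the key is absent.
        if self.default_factory is None:
            raise KeyError(key)
        # Store the computed default so later lookups find it directly.
        value = self[key] = self.default_factory(key)
        return value


# Case (1): the default depends on the key.
squares = KeyedDefaultDict(lambda key: key * key)
four_squared = squares[4]

# Case (2), adapted: the factory ignores the key.
lists = KeyedDefaultDict(lambda key: [])
lists["a"].append(1)
```

Because __missing__ also receives the dict itself as self, the zeroth case 
could be handled the same way by calling self.default_factory(self, key) 
instead.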

Thanks in advance,



-- 
Steven


