[Python-Dev] Proposal: defaultdict

Nick Coghlan ncoghlan at gmail.com
Fri Feb 17 15:27:48 CET 2006

Phillip J. Eby wrote:
> At 10:10 AM 02/17/2006 +0100, Georg Brandl wrote:
>> Guido van Rossum wrote:
>>>   d = DefaultDict([])
>>> can be written as simply
>>>   d[key].append(value)
>>> Feedback?
>> Probably a good idea, has been proposed multiple times on clpy.
>> One good thing would be to be able to specify either a default value
>> or a factory function.
> +1 on factory function, e.g. "DefaultDict(list)".  A default value isn't 
> very useful, because for immutable defaults, setdefault() works well 
> enough.  If what you want is a copy of some starting object, you can always 
> do something like DefaultDict({1:2,3:4}.copy).

+1 here, too (for permitting a factory function only).

This doesn't really limit usage, as you can still supply 
DefaultDict(partial(copy, x)) or DefaultDict(partial(deepcopy, x)), or (heaven 
forbid) a lambda expression. . .
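
As a rough illustration of the factory-only approach, the sketch below uses collections.defaultdict (the type that later shipped in the standard library) as a stand-in for the proposed DefaultDict. The name `template` and its contents are made up for the example.

```python
from collections import defaultdict
from copy import deepcopy
from functools import partial

# Hypothetical starting object; every missing key should get its own copy.
template = {"hits": 0, "tags": []}

# A type name works directly as a factory.
counts = defaultdict(int)
counts["spam"] += 1

# partial(deepcopy, template) is a zero-argument callable returning a fresh copy.
records = defaultdict(partial(deepcopy, template))
records["a"]["tags"].append("x")
records["b"]["hits"] += 1
```

Because each missing key triggers a new deepcopy, the entries for "a" and "b" do not share the nested list, and `template` itself is left untouched.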

As others have mentioned, the basic types are all easy, since the typename can 
be used directly.

+1 on supplying that factory function to the constructor, too (the default 
value is a fundamental part of the defaultdict). That is, I'd prefer:

   d = defaultdict(func)
   # The defaultdict is fully defined, but not yet populated

over:

   d = defaultdict(init_values)
   # The defaultdict is partially populated, but not yet fully defined!

That is, something that is the same as the normal dict except for:

     def __init__(self, default):
         self.default = default

     def __getitem__(self, key):
         return self.get(key, self.default())

Considering some of Raymond's questions in light of the above:
> * implications of a __getitem__ succeeding while get(value, x) returns x 
> (possibly different from the overall default)
> * implications of a __getitem__ succeeding while __contains__ would fail

These behaviours seem reasonable for a default dictionary - "containment" is 
based on whether or not the key actually exists in the dictionary as it 
currently stands, and the default is really a "default default" that can be 
overridden using 'get'.
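
A minimal sketch of how these two behaviours play out, assuming the class outlined earlier is built as a plain dict subclass (the subclassing and the name DefaultDict are assumptions here, not a settled design):

```python
class DefaultDict(dict):
    """Sketch only: a dict whose __getitem__ falls back to a factory."""

    def __init__(self, default):
        dict.__init__(self)
        self.default = default  # zero-argument factory, e.g. list or int

    def __getitem__(self, key):
        # The factory is called on every lookup, and in this sketch the
        # result is NOT stored in the dict when the key is missing.
        return self.get(key, self.default())


d = DefaultDict(list)
looked_up = d["missing"]                       # succeeds via the factory
is_contained = "missing" in d                  # key was never stored
overridden = d.get("missing", "fallback")      # explicit default wins
d["present"] = [1]                             # normal assignment stores the key
```

Note that with this particular sketch a mutable default returned for a missing key is not kept, so d[key].append(value) would need __getitem__ to store the new value as well.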

> * whether to add this to the collections module (I would say yes)
> * whether to allow default functions as well as default values (so you could 
> instantiate a new default list)

My preference is for factory functions only, to eliminate ambiguity.

# bag like behavior
dd = collections.default_dict(int)
for elem in collection:
     dd[elem] += 1

# setdefault-like behavior
dd = collections.default_dict(list)
for page_number, page in enumerate(book):
     for word in page.split():
          dd[word].append(page_number)
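
For reference, both patterns can be run as follows with collections.defaultdict, the name the type eventually took in the standard library. The sample `collection` and `book` data, and the choice to append the page number in the inner loop, are assumptions made to get a runnable example.

```python
from collections import defaultdict

# Hypothetical sample data standing in for `collection` and `book`.
collection = ["a", "b", "a"]
book = ["the cat", "the dog"]

# Bag-like behaviour: count occurrences of each element.
bag = defaultdict(int)
for elem in collection:
    bag[elem] += 1

# setdefault-like behaviour: map each word to the pages it appears on.
index = defaultdict(list)
for page_number, page in enumerate(book):
    for word in page.split():
        index[word].append(page_number)
```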

Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
