[Python-Dev] defaultdict proposal round three

Bengt Richter bokr at oz.net
Mon Feb 20 18:10:13 CET 2006

On Mon, 20 Feb 2006 05:41:43 -0800, "Guido van Rossum" <guido at python.org> wrote:

>I'm withdrawing the last proposal. I'm not convinced by the argument
>that __contains__ should always return True (perhaps it should also
>insert the value?), nor by the complaint that a holy invariant would
>be violated (so what?).
>But the amount of discussion and the number of different viewpoints
>present makes it clear that the feature as I last proposed would be
>forever divisive.
>I see two alternatives. These will cause a different kind of
>philosophical discussion; so be it. I'll describe them relative to the
>last proposal; for those who wisely skipped the last thread, here's a
>link to the proposal:
>Alternative A: add a new method to the dict type with the semantics of
>__getitem__ from the last proposal, using default_factory if not None
>(except on_missing is inlined). This avoids the discussion about
>broken invariants, but one could argue that it adds to an already
>overly broad API.
>Alternative B: provide a dict subclass that implements the __getitem__
>semantics from the last proposal. It could be an unrelated type for
>all I care, but I do care about implementation inheritance since it
>should perform just as well as an unmodified dict object, and that's
>hard to do without sharing implementation (copying would be worse).
>Parting shots:
>- Even if the default_factory were passed to the constructor, it still
>ought to be a writable attribute so it can be introspected and
>modified. A defaultdict that can't change its default factory after
>its creation is less useful.
>- It would be unwise to have a default value that would be called if
>it was callable: what if I wanted the default to be a class instance
>that happens to have a __call__ method for unrelated reasons?
You'd have to wrap it in a lambda, i.e. pass lambda: thing_with_unrelated__call__method
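
A small sketch of that wrapping, for concreteness. Greeter is a made-up class, and collections.defaultdict (the spelling the standard library eventually settled on for a factory-backed dict) is used only as a stand-in for whatever form the proposal takes:

```python
from collections import defaultdict


class Greeter(object):
    """Hypothetical value type that happens to be callable for unrelated reasons."""

    def __init__(self, name):
        self.name = name

    def __call__(self):
        return "hello from " + self.name


shared = Greeter("default")

# Passed directly, the instance is treated as the factory, so each missing
# key stores the *result* of calling it (a string), not the instance.
called = defaultdict(shared)

# Wrapped in a lambda, the factory returns the instance itself.
wrapped = defaultdict(lambda: shared)

print(type(called["a"]).__name__)   # str
print(wrapped["a"] is shared)       # True
```

The wrapper costs one extra call per missing key, but it keeps the API from having to guess whether an argument is meant as a value or as a factory.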

>Callability is an elusive property; APIs should not attempt to
>dynamically decide whether an argument is callable or not.
>- A third alternative would be to have a new method that takes an
>explicit default factory argument. This differs from setdefault() only
>in the type of the second argument. I'm not keen on this; the original
>use case came from an example where the readability of
>  d.setdefault(key, []).append(value)
>was questioned, and I'm not sure that
>  d.something(key, list).append(value)
>is any more readable. IOW I like (and I believe few have questioned)
>associating the default factory with the dict object instead of with
>the call site.
>Let the third round of the games begin!
Sorry if I missed it, but is it established that defaulting lookup
will be spelled the same as traditional lookup, i.e. d[k] or d.__getitem__(k) ?

IOW, are default-enabled dicts really going to be passed
into unknown contexts for use as a dict workalike? I can see using on_missing
for external side effects like logging etc., or _maybe_ modifying the dict with
a known separate set of keys that wouldn't be used for the normal purposes of the dict.

ISTM a defaulting dict could only reasonably be passed into contexts that expected it,
but that could still be useful. How about d = dict() for a totally normal dict,
and d.defaulting to get a view that uses d.default_factory if present? E.g.,

d = dict()
d.default_factory = list
for i,name in enumerate('Eeny Meeny Miny Moe'.split()): # prefix insert order
    d.defaulting[name].append(i)  # or hoist d.defaulting => dd[name].append(i)

Maybe d.defaulting could be a descriptor?
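
A rough sketch of how the descriptor version might look. The names DefaultingView, Defaulting and Dict are made up for illustration, and a dict subclass stands in for the builtin because plain dict instances can't take new attributes. Storing the generated value back into the dict is also an assumption, chosen so that the append in the example above sticks:

```python
class DefaultingView(object):
    """Hypothetical view: indexes the underlying dict, filling in missing keys."""

    def __init__(self, target):
        self._target = target

    def __getitem__(self, key):
        d = self._target
        try:
            return dict.__getitem__(d, key)
        except KeyError:
            factory = getattr(d, "default_factory", None)
            if factory is None:
                raise  # no factory set: behave like a normal lookup
            # Assumption: the default is stored, so later lookups see it.
            value = d[key] = factory()
            return value


class Defaulting(object):
    """Non-data descriptor that returns a view bound to the instance."""

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return DefaultingView(instance)


class Dict(dict):
    """Stand-in for a builtin dict that grew the two attributes."""
    default_factory = None
    defaulting = Defaulting()


d = Dict()
d.default_factory = list
dd = d.defaulting  # hoisted view
for i, name in enumerate('Eeny Meeny Miny Moe'.split()):
    dd[name].append(i)

print(d['Miny'])      # [2]
print('Curly' in d)   # False
```

Plain d[k] and k in d are untouched here, so code that receives d without expecting defaults still gets a KeyError for a missing key.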

If the above were done, could d.on_missing be independent and always active if present? E.g.,

    d.on_missing = lambda self, key: self.__setitem__(key, 0) or 0

would be allowed to work on its own first, irrespective of whether default_factory was set.
If it created d[key] it would effectively override default_factory if active, and
if not active, it would still act, letting you instrument a "normal" dict with special effects.

Of course, if you wanted to write an on_missing handler to use default_factory like your original
example, you could. So on_missing would always trigger if present, for missing keys, but
d.defaulting[k] would only call d.default_factory if the latter was set and the key was missing
even after on_missing (if present) did something (e.g., it could be logging passively).
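
To make the ordering concrete, here is a sketch under some assumptions of mine: the hook is called as on_missing(d, key), its return value is ignored, and "created d[key]" is detected by checking whether the key is present afterwards. Dict, _View and _lookup are made-up names, with a subclass again standing in for the builtin:

```python
class _View(object):
    """Hypothetical d.defaulting view: lookup with default_factory enabled."""

    def __init__(self, target):
        self._target = target

    def __getitem__(self, key):
        return self._target._lookup(key, use_factory=True)


class Dict(dict):
    """Stand-in for a dict with independent on_missing and default_factory."""
    default_factory = None
    on_missing = None  # assumed signature: on_missing(d, key)

    def __getitem__(self, key):
        # Plain lookup: on_missing still fires, default_factory does not.
        return self._lookup(key, use_factory=False)

    def _lookup(self, key, use_factory):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            pass
        if self.on_missing is not None:
            self.on_missing(self, key)
            if key in self:
                # on_missing created the key, overriding default_factory.
                return dict.__getitem__(self, key)
        if use_factory and self.default_factory is not None:
            value = self.default_factory()
            dict.__setitem__(self, key, value)
            return value
        raise KeyError(key)

    @property
    def defaulting(self):
        return _View(self)


seen = []
d = Dict()
d.default_factory = list
d.on_missing = lambda self, key: seen.append(key)  # passive logging only

d.defaulting['a'].append(1)  # logged, then default_factory supplies the list

d.on_missing = lambda self, key: self.__setitem__(key, 0) or 0
print(d.defaulting['b'])     # 0
```

With the passive logger installed, a plain d[k] on a missing key logs the key and then raises KeyError as usual, which is the "instrument a normal dict" case.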

Bengt Richter
