[Python-Dev] Proposal: defaultdict

Nick Coghlan ncoghlan at gmail.com
Sat Feb 18 04:34:35 CET 2006

Adam Olsen wrote:
> And the pièce de résistance..
> Doc/tools/anno-api.py:51
> It has this:
>     try:
>         info = rcdict[s]
>     except KeyError:
>         sys.stderr.write("No refcount data for %s\n" % s)
>     else:
>         ...
> rcdict is loaded from refcounts.load().  refcounts.load() calls
> refcounts.loadfile(), which has this (inside a loop):
>     try:
>         entry = d[function]
>     except KeyError:
>         entry = d[function] = Entry(function)
> A prime candidate for a default.
> Perhaps the KeyError shouldn't ever get triggered in this case, I'm
> not sure.  I think that's beside the point though.  The programmer
> clearly expected it would.

Assuming the following override:

   class EntryDict(dict):
       def on_missing(self, key):
           value = Entry(key)
           self[key] = value
           return value

Then what it means is that the behaviour of "missing functions get an empty 
refcount entry" propagates to the rcdict code.

So the consequence is that the code in anno-api will never print an error 
message - all functions are deemed to have associated refcount data in 

But that would be a bug in refcounts.loadfile: if it returns an EntryDict 
instead of a normal dict it is, in effect, returning an *infinite* dictionary 
that contains refcount definitions for every possible function name (some of 
them are just populated on demand).

So *if* refcounts.loadfile was converted to use an EntryDict, it would need to 
return dict(d) instead of returning d directly.
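
To make the leak and the dict(d) fix concrete, here is a rough sketch. The 
on_missing hook above is only a proposal, so the sketch uses the __missing__ 
hook of dict subclasses as a stand-in for it. Entry is a hypothetical 
placeholder class, lookup() is a made-up helper that mimics the anno-api 
try/except, and the function names are purely illustrative.

```python
class Entry:
    """Hypothetical stand-in for the Entry class used by refcounts."""
    def __init__(self, name):
        self.name = name


class EntryDict(dict):
    # __missing__ stands in for the proposed on_missing hook.
    def __missing__(self, key):
        value = Entry(key)
        self[key] = value
        return value


def lookup(rcdict, name):
    """Mimic the anno-api lookup; None marks the 'no refcount data' case."""
    try:
        return rcdict[name]
    except KeyError:
        return None


d = EntryDict()
d["PyList_New"]                            # loadfile-style access creates an entry

# Handing out d directly: the lookup silently invents an Entry.
leaky = lookup(d, "no_such_function")

# Handing out a plain-dict copy: missing names raise KeyError again.
snapshot = dict(d)
safe = lookup(snapshot, "another_missing_function")
```

With d returned directly the error branch is never reached; with the 
dict(d) copy it is reached as before.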

And this is where the question of whether has_key/__contains__ return True or 
False when default_factory is set is important. If they return False, then the 
LBYL (if key in d:) and EAFP (try/except) approaches give *different answers*.

More importantly, LBYL will never have side effects, whereas EAFP may.
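
A minimal sketch of that divergence, assuming the "return False" variant. 
DefaultingDict is a hypothetical class, and __missing__ again stands in for 
the proposed default hook; the containment check is left as the ordinary 
dict one.

```python
class DefaultingDict(dict):
    """Hypothetical dict whose missing keys are filled in on lookup."""
    def __missing__(self, key):
        value = self[key] = []
        return value


d = DefaultingDict()

# LBYL: the containment test does not trigger the default, so it
# answers False and leaves the dictionary untouched.
lbyl_found = "spam" in d
size_after_lbyl = len(d)

# EAFP: the lookup succeeds, and inserts the key as a side effect.
try:
    d["spam"]
    eafp_found = True
except KeyError:
    eafp_found = False
size_after_eafp = len(d)
```

The two styles disagree about whether "spam" is available, and only the 
try/except version changes the dictionary.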

If the methods always return True (as Martin suggests), then we retain the 
current behaviour where there is no real difference between the two 
approaches. Given the amount of time spent in recent years explaining this 
fact, I don't think it is an equivalence that should be broken lightly (IOW, 
I've persuaded myself that I agree with Martin).

The alternative would be to have an additional query API "will_default" that 
reflects whether or not a given key is actually present in the dictionary ("if 
key not in d.keys()" would serve a similar purpose, but requires building the 
list of keys).
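
For illustration only, the "always True" variant together with such a query 
method might look like the sketch below. AlwaysPresentDict is a hypothetical 
class, __missing__ stands in for the proposed default hook, and the polarity 
chosen for will_default (True when a lookup would have to create the value) 
is my assumption; nothing here is an agreed API.

```python
class AlwaysPresentDict(dict):
    """Hypothetical dict that claims to contain every key."""
    def __missing__(self, key):
        value = self[key] = 0
        return value

    def __contains__(self, key):
        # Every lookup will succeed, so LBYL gives the same answer as EAFP.
        return True

    def will_default(self, key):
        # True when the key is not actually stored yet, i.e. a lookup
        # would have to create the value.
        return not dict.__contains__(self, key)


d = AlwaysPresentDict()
d["a"] = 1

claims_b = "b" in d                    # containment always answers True
b_before = d.will_default("b")         # "b" is not stored yet
a_stored = d.will_default("a")         # "a" was stored explicitly
value_b = d["b"]                       # the lookup creates and stores the default
b_after = d.will_default("b")          # now "b" is really present
```

Unlike the d.keys() test, the query goes through the normal hash lookup and 
builds no intermediate list.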


Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
