[Python-Dev] Simple dicts

Tim Peters tim.one@comcast.net
Sat, 17 May 2003 21:12:12 -0400


[Damien Morton]
> ...
> On the other hand, as Tim pointed out to me in a private email, there is
> so much overhead in just getting to the hashtable inner loop, going
> around that loop one time instead of two or three seems inconsequential.

For in-cache tables.  For out-of-cache tables, each trip around the loop is
deadly.  Since heavily used portions of small dicts are likely to be
in-cache no matter how they're implemented, that's what makes me dubious
about pouring lots of effort into reducing collisions for small dicts
specifically.

> ...
> There seem to be two different ways to get/set/del from a dictionary.
>
> The first is using PyDict_[Get|Set|Del]Item()
>
> The second is using the embarssingly named dict_ass_sub() and its
> partner dict_subscript().
>
> Which of these two access methods is most likely to be used?

My guess matches Guido's:  PyDict_*, except in programs making heavy use of
explicit Python dicts.  All programs use dicts under the covers for
namespace mapping, and, e.g., instance.attr and module.attr end up calling
PyDict_GetItem() directly.  Python-level explicit dict subscripting ends up
calling dict_*, essentially because Python has no idea at compile-time
whether the x in

    x[y]

*is* a dict, so generates code that goes thru the all-purpose type-dispatch
machinery.  On the third hand, some explicit-dict slinging code seems to use

    x = somedict.get(y)

everywhere, and dict_get() doesn't call PyDict_GetItem() or
dict_subscript().