[pypy-dev] fun with dictionaries

Michael Hudson mwh at python.net
Fri Aug 11 17:49:50 CEST 2006


So part of WP06 is improving the dictionary implementation.  To do
this, it seemed like a good idea to find out what Python code does
with dictionaries, which is what I've been working on this week.

If you activate the "objspace.std.withmultidict" option, set
MEASURE_DICT in pypy/objspace/std/dictmultiobject.py to True and build
yourself a pypy-c, you'll find that this pypy-c will create a
dictinfo.txt file that summarizes how every dictionary in the program
has been used.

The benchmark programs I have been using are: pystone, richards,
"rst2html coding-guide.txt" and "translate.py --backendopt
--no-compile --batch --text targetrpystonedalone.py", and the
(compressed) results can be found in:

    http://codespeak.net/~mwh/dictinfo/

The file dictinfos.tar.bz2 contains the dictinfo.txt files created by
the above runs, and the RData files are binary files suitable for
loading into R:

    http://www.r-project.org/

What I'd like to get some input on is stuff like: what aspects of this
data should I analyse?  Is there any data you think I should collect?

Something that I don't measure at all is the order things happen in,
which might be interesting: it's easy to believe many dictionaries go
through a phase of being written to before a longer phase of being
read from.  But I'm not sure how to measure that...
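
One possible approach, as a rough application-level sketch only: log each
read and write in the order it happens, then ask what fraction of the reads
come after the last write.  The names TracingDict and reads_after_last_write
are made up for illustration and are not part of PyPy or of the MEASURE_DICT
code; a real measurement would have to live at interpreter level in
dictmultiobject.py.

```python
class TracingDict(dict):
    """Hypothetical dict subclass that logs reads ('r') and writes ('w').

    Only d[k], d[k] = v and del d[k] are traced; methods such as get(),
    update() and setdefault() bypass these hooks in this sketch.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.trace = []  # operations in the order they happened

    def __getitem__(self, key):
        self.trace.append("r")
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.trace.append("w")
        super().__setitem__(key, value)

    def __delitem__(self, key):
        self.trace.append("w")
        super().__delitem__(key)


def reads_after_last_write(trace):
    """Return the fraction of all reads that occur after the final write."""
    reads = trace.count("r")
    if reads == 0:
        return 0.0
    if "w" not in trace:
        return 1.0
    # Index of the last 'w' in the trace.
    last_write = len(trace) - 1 - trace[::-1].index("w")
    return trace[last_write + 1:].count("r") / reads


d = TracingDict()
d["a"] = 1
d["b"] = 2
d["a"]; d["b"]; d["a"]
print("".join(d.trace), reads_after_last_write(d.trace))
```

A value near 1.0 would suggest the "write phase, then read phase" pattern;
a value near 0.0 would suggest reads and writes are interleaved to the end.
Keeping a full trace per dictionary is expensive, so a real version would
probably keep only a few counters (for example, reads since the last write).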

Cheers,
mwh

-- 
  <glyph> I am *not* a PSU agent.
                                                -- from Twisted.Quotes



