how can I clear a dictionary in python

Alex Martelli aleax at mac.com
Fri Mar 30 04:14:26 CEST 2007


Russ <uymqlp502 at sneakemail.com> wrote:

> This little squabble got me thinking. I normally just use the
> myDict={} method of "clearing" a dictionary when I know there are no
> other references to it. However, I wonder how the efficiency of
> relying on the garbage collector to clear a dictionary compares with
> using the "clear" method. Does anyone know?

Well, anybody who bothers to *MEASURE* can start building an idea.

When the dict's already empty:

brain:~ alex$ python -mtimeit 'd={}'
10000000 loops, best of 3: 0.113 usec per loop
brain:~ alex$ python -mtimeit 'd={}; d={}'   
1000000 loops, best of 3: 0.207 usec per loop
brain:~ alex$ python -mtimeit 'd={}; d.clear()'
1000000 loops, best of 3: 0.316 usec per loop

Making one dict costs about 100 nanoseconds, making two of them costs
about 200 (sensible), making one and clearing it 300 (so just the
clearing, on an empty dict, about 200 nanoseconds).

Unfortunately, microbenchmarks of operations that change the very state
their timing depends on are trickier: each loop has to rebuild the dict
before it can be cleared again.  Still, here's an attempt:

brain:~ alex$ python -mtimeit -s'D=dict.fromkeys(xrange(99))' 'd=D.copy()'
100000 loops, best of 3: 6.73 usec per loop
brain:~ alex$ python -mtimeit -s'D=dict.fromkeys(xrange(99))' 'd=D.copy();d={}'
100000 loops, best of 3: 6.76 usec per loop
brain:~ alex$ python -mtimeit -s'D=dict.fromkeys(xrange(99))' 'd=D.copy()'
100000 loops, best of 3: 6.73 usec per loop
brain:~ alex$ python -mtimeit -s'D=dict.fromkeys(xrange(99))' 'd=D.copy();d={}'
100000 loops, best of 3: 6.78 usec per loop
brain:~ alex$ python -mtimeit -s'D=dict.fromkeys(xrange(99))' 'd=D.copy();d.clear()'
100000 loops, best of 3: 6.94 usec per loop
brain:~ alex$ python -mtimeit -s'D=dict.fromkeys(xrange(99))' 'd=D.copy();d.clear()'
100000 loops, best of 3: 6.93 usec per loop

Here, making a mid-sized dict (99 items) costs about 6730 nanoseconds.
Making an empty one as well adds 30-50 nanoseconds; clearing the
mid-sized one instead adds 200 nanoseconds or so.

It would appear that clearing an existing dict costs about twice as much
(or more) as making a new one (200 nanoseconds vs 100 nanoseconds or
less), at both sizes of the existing dict measured here (empty and 99
items).

This is on a 2GHz Intel Core Duo with Python 2.5 -- measures of a few
tens of nanoseconds can be "noisy" enough to change substantially on
different releases of Python or different CPUs.
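
For anyone who would rather rerun the comparison from a script than from
the shell, here is a rough sketch using the standard timeit module.  It
assumes Python 3 (so range stands in for xrange), and the helper name
best_usec and the loop counts are just illustrative choices; expect the
absolute numbers to differ from the ones quoted above.

```python
import timeit

# Python 3 sketch: range replaces the xrange used in the original commands.
SETUP = "D = dict.fromkeys(range(99))"

STATEMENTS = {
    "copy only": "d = D.copy()",
    "copy, then rebind to {}": "d = D.copy(); d = {}",
    "copy, then clear()": "d = D.copy(); d.clear()",
}

def best_usec(stmt, setup=SETUP, number=100000, repeat=3):
    """Return the best per-loop time in microseconds over `repeat` runs."""
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number * 1e6

results = {label: best_usec(stmt) for label, stmt in STATEMENTS.items()}
baseline = results["copy only"]
for label, usec in results.items():
    # The difference from the copy-only baseline isolates the extra step.
    print("%-26s %7.3f usec per loop (%+.3f vs copy only)"
          % (label, usec, usec - baseline))
```

Differences of a few tens of nanoseconds between runs are within the
noise, so it is worth running this several times before reading much
into them.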

Fortunately, it's unusual to care about such tiny performance issues
(here, the time to build the dictionary would normally swamp the time to
clear it, OR to assign an empty one instead, by over an order of
magnitude, so...).
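
The "no other references" condition in the question matters more than
the timing, since the two spellings only behave alike in that case.  A
small sketch of the difference (the names d and alias are just for
illustration):

```python
d = {"a": 1}
alias = d          # a second reference to the same dict object

d = {}             # rebinds the name d; the original dict is untouched
assert alias == {"a": 1}

d = alias          # point d back at the original dict
d.clear()          # empties the dict object itself, in place
assert alias == {}
assert d is alias
```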


Alex


