Populating a dictionary, fast [SOLVED SOLVED]
Chris Mellon
arkanes at gmail.com
Thu Nov 15 11:51:08 EST 2007
On Nov 14, 2007 5:26 PM, Steven D'Aprano
<steve at remove-this-cybersource.com.au> wrote:
> On Wed, 14 Nov 2007 18:16:25 +0100, Hrvoje Niksic wrote:
>
> > Aaron Watters <aaron.watters at gmail.com> writes:
> >
> >> On Nov 12, 12:46 pm, "Michael Bacarella" <m... at gpshopper.com> wrote:
> >>>
> >>> > It takes about 20 seconds for me. It's possible it's related to
> >>> > int/long
> >>> > unification - try using Python 2.5. If you can't switch to 2.5, try
> >>> > using string keys instead of longs.
> >>>
> >>> Yes, this was it. It ran *very* fast on Python v2.5.
> >>
> >> Um. Is this the take away from this thread? Longs as dictionary keys
> >> are bad? Only for older versions of Python?
> >
> > It sounds like Python 2.4 (and previous versions) had a bug when
> > populating large dicts on 64-bit architectures.
>
> No, I found very similar behaviour with Python 2.5.
>
>
> >> Someone please summarize.
> >
> > Yes, that would be good.
>
>
> On systems with multiple CPUs or 64-bit systems, or both, creating and/or
> deleting a multi-megabyte dictionary in recent versions of Python (2.3,
> 2.4, 2.5 at least) takes a LONG time, of the order of 30+ minutes,
> compared to seconds if the system only has a single CPU. Turning garbage
> collection off doesn't help.
>
>
I can't reproduce this on a dual-CPU system (64-bit hardware, but running
in 32-bit mode with a 32-bit OS). I added keys to a dict until I ran out
of memory (a bit over 22 million keys), and deleting the dict took
about 8 seconds (timed with a stopwatch, so not very precise, but obviously
less than 30 minutes).
>>> d = {}
>>> idx = 0
>>> while idx < 1e10:
...     d[idx] = idx
...     idx += 1
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
MemoryError
>>> len(d)
22369622
>>> del d
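
For anyone who wants numbers more precise than a stopwatch, here is a rough sketch of how the populate and delete steps could be timed separately. The function name time_dict_lifecycle and its parameters are my own invention for illustration, and it assumes a current Python 3 interpreter (time.perf_counter does not exist in 2.5), so treat it as a starting point rather than the exact test run above. The disable_gc flag is there only to check the claim that turning garbage collection off makes no difference.

```python
import gc
import time


def time_dict_lifecycle(n_keys, disable_gc=False):
    """Populate a dict with n_keys integer keys, then delete it.

    Returns (size, populate_seconds, delete_seconds).
    Hypothetical helper for illustration only.
    """
    gc_was_enabled = gc.isenabled()
    if disable_gc:
        gc.disable()
    try:
        start = time.perf_counter()
        d = {}
        for idx in range(n_keys):
            d[idx] = idx
        populated = time.perf_counter()

        size = len(d)
        # Dropping the only reference frees the dict and its contents.
        del d
        deleted = time.perf_counter()
    finally:
        # Restore whatever collector state the caller had.
        if disable_gc and gc_was_enabled:
            gc.enable()
    return size, populated - start, deleted - populated


if __name__ == "__main__":
    # One million keys is an arbitrary choice; raise it to stress memory.
    for flag in (False, True):
        size, fill_s, del_s = time_dict_lifecycle(1_000_000, disable_gc=flag)
        print(f"gc disabled={flag}: {size} keys, "
              f"populate {fill_s:.3f}s, delete {del_s:.3f}s")
```

Running it on both the fast and the slow machine, with the same key count, would show whether the time goes into populating the dict or into freeing it.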