Huge dictionary, 1 min to create, 6 to delete

Ken Kinder kkinder at tridog.com
Thu Aug 31 12:42:28 EDT 2000


Here's the deal. Python's dictionary type hashes its keys to look up and store
items. For that reason, it gets to be big -- very big. Unless you have
non-primitive datatypes in the dictionary (i.e., references to other objects), the
deletion shouldn't be a big deal, but it might still be touching a lot of memory
as it goes. Also, I tested this on a few different boxes and got differing results.
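
One way to see where the time goes is to time the build and the teardown
separately. This is only a minimal sketch, assuming a current Python 3
interpreter (the program quoted below is Python 2); the function name
time_build_and_clear and the entry count are made up for illustration, and the
numbers will differ from box to box.

```python
import time


def time_build_and_clear(n):
    """Return (build_seconds, clear_seconds) for a dict of n string entries."""
    start = time.perf_counter()
    dct = {}
    for i in range(n):
        k = str(i)
        dct[k] = k + ":" + k  # same key/value shape as the quoted test program
    built = time.perf_counter()
    dct.clear()  # releases every key and value held by the dictionary
    cleared = time.perf_counter()
    return built - start, cleared - built


if __name__ == "__main__":
    build, clear = time_build_and_clear(1000000)
    print("build: %.2fs  clear: %.2fs" % (build, clear))
```

If the clear step is far slower than the build step and the disk is busy the
whole time, that points toward paging rather than the dictionary itself.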

If you need a really high-performance mapping datatype, you might consider writing
one in C or C++ and importing it into Python. If you use a different hashing or
storage scheme, you can trade off things like lookup speed against memory size.
Also, you could hard-code it to accept only numbers and skip the general-purpose
type handling.
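
To make the numbers-only idea concrete, here is a rough sketch of the data
layout. It is written in Python only to keep the example short; the real gain
would come from doing the same thing in C or C++. The class name IntMap, the
fixed capacity, the linear probing, and the -1 sentinel are all my own
assumptions, not an existing library.

```python
from array import array


class IntMap:
    """Fixed-capacity int -> int map: open addressing with linear probing."""

    EMPTY = -1  # sentinel for an unused slot, so keys must be non-negative

    def __init__(self, capacity=1024):
        # Two flat arrays of machine integers instead of boxed Python objects.
        self.capacity = capacity
        self.keys = array("q", [self.EMPTY]) * capacity
        self.values = array("q", [0]) * capacity
        self.count = 0

    def _slot(self, key):
        # Walk forward from the key's home slot until we hit the key or a hole.
        i = key % self.capacity
        while self.keys[i] != self.EMPTY and self.keys[i] != key:
            i = (i + 1) % self.capacity
        return i

    def put(self, key, value):
        if key < 0:
            raise ValueError("keys must be non-negative integers")
        i = self._slot(key)
        if self.keys[i] == self.EMPTY:
            # Always keep one slot free so that probing is sure to terminate.
            if self.count >= self.capacity - 1:
                raise MemoryError("IntMap is full")
            self.keys[i] = key
            self.count += 1
        self.values[i] = value

    def get(self, key, default=None):
        if key < 0:
            return default
        i = self._slot(key)
        return self.values[i] if self.keys[i] == key else default
```

A bigger capacity means fewer collisions and faster lookups at the cost of
memory; a smaller one saves memory but makes the probe loop run longer. That
is the trade-off I mean.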

For more information on various mapping types and how to implement them, I would go
down to the bookstore and get a C/C++ algorithms book. I think O'Reilly has a nice
one.

Rick Pasotto wrote:

> On Thu, 31 Aug 2000 00:12:52 -0700 in comp.lang.python, Emile van Sebille wrote:
> > Can you pare the code down to an example that
> > exhibits the problem and post it?  When I create
> > a dictionary with 1M entries and exit, I experience
> > no delay.
>
> This is what I got on my machine (linux 2.2.16, PII 400, 192m):
>
> <start prog>
> import time
>
> print "    ",time.ctime(time.time())
> dct = {}
> for i in range(1000000):
>         try:
>                 k = str(i)
>                 v = k + ":" + k
>                 dct[k] = v
>         except:
>                 print i
> print "    ",time.ctime(time.time())
> </end prog>
>
> @tc:~/python$ date; python dict.py; date
> Thu Aug 31 09:11:44 EDT 2000
>     Thu Aug 31 09:11:44 2000
>     Thu Aug 31 09:13:03 2000
> Thu Aug 31 09:14:36 EDT 2000
>
> Both the creation and the deletion took ~1m20s.
>
> > "haaserd" <haaserd at yahoo.com> wrote in message
> > news:39AD8BDA.84DB5BB3 at yahoo.com...
> > > As a learning exercise, I decided to use the python
> > > dictionary in a program which tries to create crossword
> > > puzzles (solutions less clues).  In doing so I created a
> > > > dictionary with about 1,000,000 entries.  This takes about a
> > > > minute on my AMD 700 processor, with very little(no) disk
> > > > activity.
> > > >
> > > > My problem is that when the program ends, it takes about 6
> > > > minutes, during which there is very heavy disk activity.  As
> > > > a test, I did a dict.clear(), and had the same result.
> > > >
> > > > Is this the garbage collection problem mentioned briefly in
> > > > a few discussions?  Or am I just hitting a Windows 98 paging
> > > > problem unrelated to python?  I have 128 MB of memory.
> > >
> > > TIA
> > >
> > > Roger Haase
> > >
> >
> >
>
> --
> "Moderation in temper is always a virtue; but moderation in
>  principle is always a vice."
>                 -- Thomas Paine, _The Rights of Man_ (1791)
>                    Rick Pasotto email: rickp at vnet.net
> --
> http://www.python.org/mailman/listinfo/python-list

