need optimizing help

Robert Brewer fumanchu at
Sat Mar 13 22:44:43 CET 2004

rabbits77 wrote:
> > I have a dictionary with a very large number (possibly millions) of
> > key/value pairs.
> > The key is a tuple that looks like (id,date)
> > What is the fastest way to get out all of the values that match any
> > key given that the individual key elements are coming from two
> > separate lists?
> > The approach of
> > for id in IDS:
> >     for date in dates:
> >         data=myDict[(id,date)]
> > 
> > seems to just take too long. Is there a speedier, more pythonic, way
> > of doing this? Any help speeding this up would be much appreciated!!

and I replied:
> If you're willing to handle some minor side-effects, one 
> common approach is an index layer via a nested dict;
> that is, instead of:
> myDict[(id, date)] = value
> execute:
> myIndex.setdefault(id, {})[date] = value
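
A minimal sketch of the nested-dict idea above, written for current Python. The helper names build_index and lookup and the sample data are illustrative only and do not come from the thread. Skipping missing ids and dates instead of raising KeyError is an assumption about what the original poster wants.

```python
import datetime


def build_index(flat):
    """Turn {(id, date): value} into {id: {date: value}}."""
    index = {}
    for (id_, date), value in flat.items():
        index.setdefault(id_, {})[date] = value
    return index


def lookup(index, ids, dates):
    """Collect values for every (id, date) combination that exists."""
    found = []
    for id_ in ids:
        by_date = index.get(id_)
        if by_date is None:
            # Unknown id: skip the whole inner loop over dates.
            continue
        for date in dates:
            if date in by_date:
                found.append(by_date[date])
    return found


# Illustrative data only.
flat = {
    (0, datetime.date(2004, 3, 12)): "a",
    (0, datetime.date(2004, 3, 13)): "b",
    (1, datetime.date(2004, 3, 12)): "c",
}
index = build_index(flat)
wanted_dates = [datetime.date(2004, 3, 12), datetime.date(2004, 3, 13)]
result = lookup(index, [0, 2], wanted_dates)
```

Here each id is looked up once per outer iteration, no tuple is built per lookup, and an id that is absent skips its inner loop entirely. Whether that is actually faster on a given data set is worth measuring (for example with timeit) rather than assuming.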

Postscript: it may be that your 'id' and 'date' values are properties of
whatever objects you're putting into the dictionary. If so, and you want
such a beast which does the indexing semi-automatically, try the Index
class I just made last week (ignore the Chain class--that solves a
different problem): ->

It uses parallel dictionaries, not nested ones. For your example, you
might write:

>>> import buckets
>>> import datetime
>>> class Thing(object):
... 	def __init__(self, id, date):
... 		self.id = id
... 		self.date = date
>>> myDict = buckets.Index('id', 'date')
>>>,, 12, 25)))
>>>,, 12, 26)))
>>> print myDict
{'date': {, 12, 25): [<__main__.Thing object at
0x01170390>],, 12, 26): [<__main__.Thing object at
0x011708B0>]}, 'id': {0: [<__main__.Thing object at 0x01170390>], 1:
[<__main__.Thing object at 0x011708B0>]}}
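
For readers without the buckets module, here is a rough sketch of the "parallel dictionaries" layout that the printed output suggests: one dict per indexed attribute, each mapping an attribute value to a list of objects. AttrIndex, add, and find are hypothetical names chosen for illustration; this is not the buckets.Index implementation, and the dates are sample data.

```python
import datetime


class Thing:
    def __init__(self, id, date):
        self.id = id
        self.date = date


class AttrIndex:
    """Hypothetical stand-in: one dict per indexed attribute name."""

    def __init__(self, *attrs):
        self.maps = {attr: {} for attr in attrs}

    def add(self, obj):
        # File the object under its value for each indexed attribute.
        for attr, mapping in self.maps.items():
            mapping.setdefault(getattr(obj, attr), []).append(obj)

    def find(self, **criteria):
        """Return the objects that match every attr=value pair given."""
        result = None
        for attr, value in criteria.items():
            bucket = self.maps[attr].get(value, [])
            if result is None:
                result = list(bucket)
            else:
                # Intersect by object identity, keeping insertion order.
                keep = {id(o) for o in bucket}
                result = [o for o in result if id(o) in keep]
        return result or []


idx = AttrIndex('id', 'date')
a = Thing(0, datetime.date(2004, 12, 25))
b = Thing(1, datetime.date(2004, 12, 26))
idx.add(a)
idx.add(b)
matches = idx.find(id=0, date=datetime.date(2004, 12, 25))
```

The trade-off against the nested dict is that each attribute can be queried on its own (all Things for one date, say) at the cost of an intersection step when several attributes are combined.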

By the way, if any gurus out there have recommendations for improving
the Index class (even rewriting it in C), I'd love to hear them and see
something like it included in the Library.

Robert Brewer
Amor Ministries
fumanchu at
