Populating a dictionary, fast [SOLVED]
Francesc Altet
faltet at carabos.com
Tue Nov 13 11:27:11 EST 2007
On Monday 12 November 2007, Michael Bacarella wrote:
> As for the solution, after trying a half-dozen different integer
> hashing functions and hash table sizes (the brute force approach),
> on a total whim I switched to a model with two dictionary tiers and
> got whole orders of magnitude better performance.
>
> The tiering is, for a given key of type long:
>
> id2name[key >> 40][key & 0x10000000000] = name
>
> Much, much better. A few minutes versus hours this way.
>
> I suspect it could be brought down to seconds with a third level of
> tiers but this is no longer posing the biggest bottleneck... ;)
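For readers following along, the two-tier scheme quoted above might look roughly like the sketch below. Note that the mask as quoted, 0x10000000000, keeps only a single bit (bit 40); the sketch assumes the intent was to keep the low 40 bits, i.e. 0xffffffffff. The helper names put and get, and the sample keys, are made up for illustration.

```python
from collections import defaultdict

LOW_BITS = 40                    # split point taken from the quoted `key >> 40`
LOW_MASK = (1 << LOW_BITS) - 1   # assumption: low 40 bits (0xffffffffff) were intended

# Outer tier is keyed by the high bits, inner tier by the low bits.
id2name = defaultdict(dict)

def put(key, name):
    """Store name under key, split across the two tiers (hypothetical helper)."""
    id2name[key >> LOW_BITS][key & LOW_MASK] = name

def get(key):
    """Return the name for key, or None if absent (hypothetical helper)."""
    # .get on the outer tier avoids creating an empty inner dict on a miss.
    return id2name.get(key >> LOW_BITS, {}).get(key & LOW_MASK)

# Made-up sample data: two keys share an outer bucket, one does not.
put((3 << 40) | 7, "alpha")
put((3 << 40) | 8, "beta")
put((5 << 40) | 7, "gamma")
```

Whether this helps at all will depend on the key distribution and the Python build, so take it as an illustration of the layout only.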
I don't know exactly why you need a dictionary for keeping the data,
but in case you want ultra-fast access to values, there is no
replacement for keeping three lists: a sorted list of keys, a list with
the original indices of the values, and the list of values itself. Then,
to access a value, you only have to do a binary search on the sorted
list, another lookup in the original-indices list, and then go straight
to the value in the value list. This should be the fastest approach I
can think of.
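A minimal sketch of that three-list layout, using the standard bisect module for the binary search. The sample data and the names order, sorted_keys and lookup are made up for illustration; with millions of keys you would probably want arrays rather than plain lists.

```python
from bisect import bisect_left

# Made-up sample data: values[i] belongs to keys[i], in arrival order.
keys = [42, 7, 19, 3]
values = ["d", "a", "c", "b"]

# Original indices, ordered by key, and the keys in that same order.
order = sorted(range(len(keys)), key=keys.__getitem__)
sorted_keys = [keys[i] for i in order]

def lookup(key):
    """Binary search in sorted_keys, then hop through order into values."""
    pos = bisect_left(sorted_keys, key)
    if pos == len(sorted_keys) or sorted_keys[pos] != key:
        raise KeyError(key)
    return values[order[pos]]
```

The values list is never reordered here, so building the index only costs one sort of the key positions.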
Another possibility is using an indexed column in a table in a DB.
Lookups there should be much faster than using a dictionary as well.
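As an illustration of the DB route, here is a small sketch using the sqlite3 module from the standard library with an in-memory database. The table name id2name is borrowed from the quoted message; the column names, the helper db_lookup and the sample rows are assumptions. An INTEGER PRIMARY KEY column in SQLite is looked up through the table's B-tree, so it serves as the indexed column here. The sketch only shows the mechanics and makes no claim about timings.

```python
import sqlite3

# In-memory database for the example; a real one would live in a file.
conn = sqlite3.connect(":memory:")

# INTEGER PRIMARY KEY is the table's B-tree key, so lookups by id are indexed.
conn.execute("CREATE TABLE id2name (id INTEGER PRIMARY KEY, name TEXT)")

# Made-up sample rows; the ids fit in SQLite's signed 64-bit integers.
rows = [((3 << 40) | 7, "alpha"), (42, "beta")]
conn.executemany("INSERT INTO id2name (id, name) VALUES (?, ?)", rows)
conn.commit()

def db_lookup(key):
    """Return the name stored for key, or None if absent (hypothetical helper)."""
    cur = conn.execute("SELECT name FROM id2name WHERE id = ?", (key,))
    row = cur.fetchone()
    return row[0] if row is not None else None
```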
HTH,
--
>0,0< Francesc Altet http://www.carabos.com/
V V Cárabos Coop. V. Enjoy Data
"-"