dict would be very slow for big data
Tim Chase
python.list at tim.thechases.com
Tue May 12 06:21:15 EDT 2009
> I am trying to insert a lot of data into a dict, perhaps on the
> order of 10,000,000 entries.  After inserting 100,000 entries,
> the insert rate becomes very slow, around 50,000/s, and the
> entire time needed for this task would be very long as well.
> Would anyone know of a solution for this case?
As others have mentioned, you've likely run out of RAM, and the
slowness you're seeing is your OS swapping your process to disk.
If you need fast dict-like access to your data, I'd recommend
shifting to an on-disk database -- perhaps the stock "anydbm"
module[1].  The only catch is that it supports only strings as
keys/values, but Python makes it fairly easy to marshal objects
in and out of strings, e.g. with the pickle module.
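Something along these lines (a minimal sketch; the filename and
the sample record are just placeholders):

  import anydbm
  import pickle

  db = anydbm.open("cache.db", "c")   # "c" creates the file if needed
  record = {"count": 42, "items": [1, 2, 3]}
  db["some-key"] = pickle.dumps(record)    # object -> string going in
  restored = pickle.loads(db["some-key"])  # string -> object coming out
  db.close()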
Alternatively, you could use the built-in (as of Python 2.5)
sqlite3 module to preserve your datatypes and query your dataset
with the power of SQL.
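For example (again just a sketch -- the table and file names are
made up, and feeding executemany() a generator keeps memory use
flat instead of building a 10,000,000-item structure in RAM):

  import sqlite3

  conn = sqlite3.connect("data.db")
  conn.execute("CREATE TABLE IF NOT EXISTS kv"
               " (key TEXT PRIMARY KEY, value INTEGER)")
  # stream the rows in rather than holding them all in memory
  conn.executemany("INSERT OR REPLACE INTO kv VALUES (?, ?)",
                   ((str(i), i * i) for i in xrange(10000000)))
  conn.commit()
  # dict-like lookup, but done by the database
  row = conn.execute("SELECT value FROM kv WHERE key = ?",
                     ("12345",)).fetchone()
  conn.close()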
-tkc
[1] http://docs.python.org/library/anydbm.html