Populating huge data structures from disk
horpner at yahoo.com
Tue Nov 6 22:43:26 CET 2007
On 2007-11-06, Michael Bacarella <mbac at gpshopper.com> wrote:
> And there's no solace in lists either:
> $ time python eat800.py
> real 4m2.796s
> user 3m57.865s
> sys 0m3.638s
> $ cat eat800.py
> import struct
> d = []
> f = open('/dev/zero')
> for i in xrange(100000000):
>     d.append(struct.unpack('L', f.read(8)))
> cPickle with protocol 2 has some promise but is more complicated because
> arrays can't be pickled. In a perfect world I could do something like this
> somewhere in the backroom:
> x = lengthy_number_crunching()
> and in the application do...
> x = magic.mmap("/important-data")
> and once the mlock finishes bringing important-data into RAM, at
> the speed of your disk I/O subsystem, all accesses to x will be
> hits against RAM.
> Any thoughts?
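The "magic.mmap" idea above can be sketched with the stdlib mmap and
struct modules: crunch the numbers once, write them to a flat binary
file, then map it read-only so the OS page cache holds the data rather
than millions of Python objects. Modern Python shown; the file name,
element count, and fixed 8-byte layout are illustrative, not from the
original post:

```python
import mmap
import os
import struct

PATH = "important-data.bin"  # illustrative name
N = 1000                     # small for the demo; the real case is 100,000,000

# Backroom step: stand-in for lengthy_number_crunching().
# Write N native 8-byte unsigned integers to a flat file.
with open(PATH, "wb") as f:
    f.write(struct.pack("%dQ" % N, *range(N)))

# Application step: map the file and index into it, no unpickling.
with open(PATH, "rb") as f:
    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

def get(i):
    # Value i lives at byte offset i * 8 in the mapping.
    return struct.unpack_from("Q", m, i * 8)[0]

value = get(42)
print(value)  # -> 42

m.close()
os.remove(PATH)
```

Lookups go through struct.unpack_from, so only the pages actually
touched are faulted in from disk; nothing is deserialized up front.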
Disable the garbage collector, use a while loop and manual index
instead of an iterator, preallocate your list, e.g.,
[None]*100000000, and hope they don't have blasters!