Shelve operations are very slow and create huge files

Tim Churches tchur at optushome.com.au
Sun Nov 2 02:36:00 EST 2003


On Sun, 2003-11-02 at 03:38, Eric Wichterich wrote:
> One script searches the MySQL-database and stores the result, the next 
> script reads the shelve again and processes the result. But there is a 
> problem: if the second script is called too early, the error "(11, 
> 'Resource temporarily unavailable') " occurs.

The only reason to use shelves is if your query results are too large
(in total) to fit in memory, and thus have to be retrieved, stored and
processed row-by-row. 

> So I took a closer look at the file that is generated by the shelf: The 
> result-list from MySQL-Query contains 14.600 rows with 7 columns. But, 
> the saved file is over 3 MB large and contains over 230.000 lines (!), 
> which seems way too much!

But that doesn't seem to be the case - your query results can easily fit
in memory. The query may still take a long time to execute, though, so
it may be reasonable to want to store or cache the results for further
processing later. Even so, it is much quicker to just pickle (cPickle)
the results to a gzipped file than to use shelve. The use of gzip
actually speeds things up, provided that your CPU is reasonably fast and
your disc storage system is mundane (any CPU faster than about 500 MHz
sees gains on most result sets). It also saves disc space.
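
A minimal sketch of the gzipped-pickle approach, written for present-day
Python (where the plain pickle module already uses the C implementation
that cPickle used to provide). The names save_results and load_results
and the ".tmp" suffix are made up for illustration. The write-then-rename
step is an extra assumption, not something from the original scripts: it
is one way to keep a second script from opening a half-written file.

```python
import gzip
import os
import pickle


def save_results(rows, path):
    """Pickle query results into a gzip-compressed file.

    The data is written to a temporary file first and then renamed,
    so a reader never sees a partially written cache file.
    (The temporary-file naming is an illustrative assumption.)
    """
    tmp_path = path + ".tmp"
    with gzip.open(tmp_path, "wb") as f:
        pickle.dump(rows, f, protocol=pickle.HIGHEST_PROTOCOL)
    # os.replace overwrites the target; the rename is atomic on POSIX
    # when both paths are on the same filesystem.
    os.replace(tmp_path, path)


def load_results(path):
    """Read back the whole result list in one go."""
    with gzip.open(path, "rb") as f:
        return pickle.load(f)
```

The first script would call save_results(cursor.fetchall(), "results.pkl.gz")
after running the query, and the second would call
load_results("results.pkl.gz"). Only load pickles that your own scripts
wrote, since unpickling untrusted data can execute arbitrary code.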

Ole Nielsen and Peter Christen have written a neat set of Python
functions which will automatically handle the caching of query results
from a MySQL database in gzipped pickles - see
http://csl.anu.edu.au/ml/dm/dm_software.html - except that the files
don't seem to be available from that page. Ole and Peter, please fix!

-- 

Tim C

PGP/GnuPG Key 1024D/EAF993D0 available from keyservers everywhere
or at http://members.optushome.com.au/tchur/pubkey.asc
Key fingerprint = 8C22 BF76 33BA B3B5 1D5B  EB37 7891 46A9 EAF9 93D0

