Shelve operations are very slow and create huge files

Ole Moller Nielsen ole at cuttlefish.anu.edu.au
Sun Nov 2 22:29:31 EST 2003


	Hi Everyone
	
	Yes, we have used caching of database queries extensively in several projects.
	Our data mining pages are available at
	http://datamining.anu.edu.au/
	and the dm software, including caching.py, is at
	http://datamining.anu.edu.au/software/dmtools/index.html
	
	Good luck
	Ole

	On Sun, Nov 02, 2003 at 06:36:00PM +1100, Tim Churches wrote:
> On Sun, 2003-11-02 at 03:38, Eric Wichterich wrote:
> > One script searches the MySQL-database and stores the result, the next 
> > script reads the shelve again and processes the result. But there is a 
> > problem: if the second script is called too early, the error "(11, 
> > 'Resource temporarily unavailable') " occurs.
> 
> The only reason to use shelves is if your query results are too large
> (in total) to fit in memory, and thus have to be retrieved, stored and
> processed row-by-row. 
> 
> > So I took a closer look at the file that is generated by the shelf: The 
> > result-list from MySQL-Query contains 14.600 rows with 7 columns. But, 
> > the saved file is over 3 MB large and contains over 230.000 lines (!), 
> > which seems way too much!
> 
> But that doesn't seem to be the case - your query results can easily fit
> in memory. However, the query may still take a long time to execute, so
> it may be reasonable to want to store or cache the results for further
> processing later. However, it is much quicker to just pickle (cPickle)
> the results to a gzipped file than to use shelve. The use of gzip
> actually speeds things up, provided that your CPU is reasonably fast and
> your disc storage system is mundane (any CPU faster than about 500 MHz
> sees gains on most result sets). Also it saves disc space.
> 
> Ole Nielsen and Peter Christen have written a neat set of Python
> functions which will automatically handle the caching of query results
> from a MySQL database in gzipped pickles - see
> http://csl.anu.edu.au/ml/dm/dm_software.html - except the files don't
> seem to be available from that page - Ole and Peter, please fix!
> 
> -- 
> 
> Tim C
> 
> PGP/GnuPG Key 1024D/EAF993D0 available from keyservers everywhere
> or at http://members.optushome.com.au/tchur/pubkey.asc
> Key fingerprint = 8C22 BF76 33BA B3B5 1D5B  EB37 7891 46A9 EAF9 93D0
> 
> 
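
For reference, the gzipped-pickle approach Tim describes above can be sketched roughly as follows. This is only an illustration, not the code in caching.py: the function names save_results and load_results are made up for this example, and it uses the Python 3 pickle module in place of cPickle. Writing to a temporary file and renaming it into place is an extra assumption added here, so that a second script started too early sees either no cache file or a complete one.

```python
import gzip
import os
import pickle
import tempfile


def save_results(rows, path):
    """Pickle `rows` into a gzip-compressed file at `path`.

    Hypothetical helper: the data is written to a temporary file in the
    same directory and then renamed, so a reader never sees a partial file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    os.close(fd)  # gzip.open reopens the file by name
    try:
        with gzip.open(tmp_path, "wb") as f:
            pickle.dump(rows, f, protocol=pickle.HIGHEST_PROTOCOL)
        os.replace(tmp_path, path)  # rename over the final name
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_results(path):
    """Read back a result set written by save_results."""
    with gzip.open(path, "rb") as f:
        return pickle.load(f)
```

The first script would call save_results(rows, "results.pkl.gz") once the query has finished, and the second would call load_results("results.pkl.gz") after checking that the file exists. As with any pickle, only load files you wrote yourself.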



-- 
-------------------------------------------------------------------  
Ole Nielsen                     | Email: Ole.Nielsen at anu.edu.au
-------------------------------------------------------------------  
Mathematical Sciences Institute | Phone: +61 2 6125 3873 (Direct)
Australian National University  | Fax:   +61 2 6125 5549
Canberra ACT 0200               | 
Australia                       | 
-------------------------------------------------------------------  
ANU CRICOS # 00120C
-------------------------------------------------------------------  




