Shelve operations are very slow and create huge files
Ole Moller Nielsen
ole at cuttlefish.anu.edu.au
Sun Nov 2 22:29:31 EST 2003
Hi Everyone
Yes, we have used caching of database queries extensively in several projects.
Our datamining pages are available at
http://datamining.anu.edu.au/
and the dm software and caching.py at
http://datamining.anu.edu.au/software/dmtools/index.html
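
For anyone who just wants the general idea, the gzipped-pickle approach Tim describes below can be sketched roughly as follows. This is only an illustrative sketch and not the code in caching.py: the function names (save_rows, load_rows, cached_query), the SHA-1 file naming and the run_query callback are all assumptions made for the example. It uses the pickle module, which in current Python versions takes the place of cPickle. The write-then-rename step is one possible way to keep a second script from reading a file that is still being written.

```python
import gzip
import hashlib
import os
import pickle
import tempfile


def save_rows(rows, path):
    """Pickle the result rows into a gzip-compressed file."""
    with gzip.open(path, "wb") as f:
        pickle.dump(rows, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_rows(path):
    """Read result rows back from a gzip-compressed pickle."""
    with gzip.open(path, "rb") as f:
        return pickle.load(f)


def cached_query(sql, run_query, cache_dir):
    """Return cached rows for `sql`, or call run_query(sql) and cache them.

    `run_query` is a hypothetical callback that executes the SQL and
    returns the rows (for example, a wrapper around cursor.fetchall()).
    """
    # Name the cache file after a hash of the query text (an assumption).
    key = hashlib.sha1(sql.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".pkl.gz")
    if os.path.exists(path):
        return load_rows(path)

    rows = run_query(sql)
    # Write under a temporary name, then rename, so that a reader
    # never opens a partially written cache file.
    tmp_path = path + ".tmp"
    save_rows(rows, tmp_path)
    os.replace(tmp_path, path)
    return rows


if __name__ == "__main__":
    cache_dir = tempfile.mkdtemp()

    def fake_query(sql):
        # Stand-in for a real database call.
        return [(1, "alpha"), (2, "beta")]

    first = cached_query("SELECT id, name FROM t", fake_query, cache_dir)
    second = cached_query("SELECT id, name FROM t", fake_query, cache_dir)
    print(first == second)
```

The second call is served from the cache file, so the query itself runs only once.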
Good luck
Ole
On Sun, Nov 02, 2003 at 06:36:00PM +1100, Tim Churches wrote:
> On Sun, 2003-11-02 at 03:38, Eric Wichterich wrote:
> > One script searches the MySQL-database and stores the result, the next
> > script reads the shelve again and processes the result. But there is a
> > problem: if the second script is called too early, the error "(11,
> > 'Resource temporarily unavailable') " occurs.
>
> The only reason to use shelves is if your query results are too large
> (in total) to fit in memory, and thus have to be retrieved, stored and
> processed row-by-row.
>
> > So I took a closer look at the file that is generated by the shelf: The
> > result-list from MySQL-Query contains 14.600 rows with 7 columns. But,
> > the saved file is over 3 MB large and contains over 230.000 lines (!),
> > which seems way too much!
>
> But that doesn't seem to be the case - your query results can easily fit
> in memory. However, the query may still take a long time to execute, so
> it may be reasonable to want to store or cache the results for further
> processing later. However, it is much quicker to just pickle (cPickle)
> the results to a gzipped file than to use shelve. The use of gzip
> actually speeds things up, provided that your CPU is reasonably fast and
> your disc storage system is mundane (any CPU faster than about 500 MHz
> sees gains on most result sets). Also it saves disc space.
>
> Ole Nielsen and Peter Christen have written a neat set of Python
> functions which will automatically handle the caching of query results
> from a MySQL database in gzipped pickles - see
> http://csl.anu.edu.au/ml/dm/dm_software.html - except the files don't
> seem to be available from that page - Ole and Peter, please fix!
>
> --
>
> Tim C
>
> PGP/GnuPG Key 1024D/EAF993D0 available from keyservers everywhere
> or at http://members.optushome.com.au/tchur/pubkey.asc
> Key fingerprint = 8C22 BF76 33BA B3B5 1D5B EB37 7891 46A9 EAF9 93D0
>
>
--
-------------------------------------------------------------------
Ole Nielsen | Email: Ole.Nielsen at anu.edu.au
-------------------------------------------------------------------
Mathematical Sciences Institute | Phone: +61 2 6125 3873 (Direct)
Australian National University | Fax: +61 2 6125 5549
Canberra ACT 0200 |
Australia |
-------------------------------------------------------------------
ANU CRICOS # 00120C
-------------------------------------------------------------------