Why does shelve make such large files?
Gerrit Holl
Gerrit.Holl at p98.f112.n480.z2.fidonet.org
Fri Jul 2 10:31:52 EDT 1999
From: Gerrit Holl <gerrit.holl at pobox.com>
On Thu, Jul 01, 1999 at 10:47:57PM +0000, Ovidiu Predescu wrote:
> Gerrit Holl wrote:
>
> > Is it really necessary for shelve to make such large files?
> > Have a look at this:
> > /tmp> python
> > Python 1.5.2 (#1, Apr 18 1999, 00:16:12) [GCC 2.7.2.3] on linux2
> > Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
> > >>> import shelve
> > >>> d = shelve.open('database')
> > >>> d['key'] = 'value'
> > >>>
> > /tmp> ls -l database
> > -rw-rw-r-- 1 gerrit gerrit 16384 Jul 1 21:13 database
> >                            ^^^^^
> >
> > 16 KB for only one key!!?
> > pickle seems to make _much_ smaller files!
> >
> > Why is this?
>
> The shelve module uses DBM which is like a small database that allows
> you to store objects and search it for a given key. DBM allows you to
> store millions of objects and search for them later without requiring
> you to load all the file in memory first.
>
Interesting...
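
To check that I follow, here is a minimal sketch of that keyed access. It is written for a present-day Python 3 rather than the 1.5.2 shown above, and the scratch directory, the key names and the stored values are made up for illustration.

```python
import os
import shelve
import tempfile

# Hypothetical scratch location; any writable path would do.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "database")

# Store a thousand small records, each under its own key.
with shelve.open(path) as db:
    for i in range(1000):
        db["key%d" % i] = {"n": i}

# Reopen the shelf and fetch one record by key.
# Only the requested value is unpickled, not the other 999.
with shelve.open(path) as db:
    record = db["key500"]
    has_missing = "no-such-key" in db

print(record, has_missing)
```

The second `with` block never builds the full set of records in memory; it asks the underlying DBM file for one key at a time.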
> The pickle module on the other hand is serializing the objects with the
> purpose of deserializing them _all_ from the file later. Pickle does not
> offer you any way to search for data based on a key, you have to do this
> yourself after the objects have been created from the file. This is
> opposed to the way shelve handles this, all the key accesses and
> insertions in a shelve object are actually reads or writes to or from
> the DBM file.
>
> And to answer your question, DBM is creating these big files because of
> the way it manages the database. The data in the database file could
> have gaps as a result of multiple insertions and deletions. Pickle's
> data in files is a simple representation of the objects that were
> written and there is no way to update the file other than rewriting it
> entirely.
>
Ah, I understand.
So pickle is useful for very small databases, but when they're really huge, one
should use shelve. Is that right?
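
A rough way to compare the two side by side, as a sketch only: it assumes a present-day Python 3, and the file names and the one-entry dictionary are invented for the example. The DBM backend that shelve picks differs between systems, so the on-disk name and size of the shelf will vary; the code sums every file that starts with the shelf's path instead of assuming a single file.

```python
import glob
import os
import pickle
import shelve
import tempfile

# Hypothetical scratch directory and file names.
workdir = tempfile.mkdtemp()
pickle_path = os.path.join(workdir, "data.pickle")
shelf_path = os.path.join(workdir, "database")

data = {"key": "value"}

# pickle: the whole dictionary is written as one stream ...
with open(pickle_path, "wb") as f:
    pickle.dump(data, f)

# ... and has to be read back as a whole before any key can be looked up.
with open(pickle_path, "rb") as f:
    loaded = pickle.load(f)

# shelve: the same single key goes into a DBM file.
with shelve.open(shelf_path) as db:
    db["key"] = "value"

# Reading it back touches only the requested key.
with shelve.open(shelf_path) as db:
    shelf_value = db["key"]

# The backend may add a suffix or create several files, so sum them all.
pickle_size = os.path.getsize(pickle_path)
shelf_size = sum(os.path.getsize(p) for p in glob.glob(shelf_path + "*"))

print("pickle:", pickle_size, "bytes; shelve:", shelf_size, "bytes")
```

If I read the explanation correctly, the deciding factor is less the size of the data than how it is used: pickle fits data that is loaded and saved in one go, shelve fits data that is looked up or updated one key at a time.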
regards,
Gerrit.
--
The Dutch Linuxgames homepage: http://linuxgames.nl.linux.org
Personal homepage: http://www.nl.linux.org/~gerrit/
Discoverb is a python program (in several languages) which tests the words you
learned by asking it. Homepage: http://www.nl.linux.org/~gerrit/discoverb/
Oh my god! They killed init! You bastards!