Why does shelve make such large files?
Thomas S. Strinnhed
Thomas.S..Strinnhed at p98.f112.n480.z2.fidonet.org
Fri Jul 2 13:13:24 EDT 1999
From: "Thomas S. Strinnhed" <thstr at serop.abb.se>
Hi
Ovidiu Predescu wrote:
>[...]
>
> The shelve module uses DBM, which is like a small database that lets
> you store objects and look them up by key. DBM allows you to store
> millions of objects and search for them later without requiring you to
> load the whole file into memory first.
>
> The pickle module, on the other hand, serializes objects with the
> purpose of deserializing them _all_ from the file later. Pickle does not
> offer you any way to search for data based on a key; you have to do this
> yourself after the objects have been re-created from the file. This is
> in contrast to shelve, where all key accesses and insertions on a shelve
> object are actually reads from or writes to the DBM file.
>
> And to answer your question, DBM creates these big files because of
> the way it manages the database. The data in the database file can
> have gaps as a result of multiple insertions and deletions. Pickle's
> data in a file is a simple representation of the objects that were
> written, and there is no way to update the file other than rewriting it
> entirely.
>
> --
> Ovidiu Predescu <ovidiu at cup.hp.com>
> http://www.geocities.com/SiliconValley/Monitor/7464/
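The difference described above can be sketched roughly as follows. This is only a minimal illustration, assuming a current Python 3 with the standard pickle and shelve modules; the sample data, file names and variable names are made up for the example and do not come from the original discussion.

```python
import os
import pickle
import shelve
import tempfile

# Hypothetical sample data; names and paths are illustrative only.
records = {"alice": {"age": 30}, "bob": {"age": 25}}

workdir = tempfile.mkdtemp()
pickle_path = os.path.join(workdir, "records.pkl")
shelf_path = os.path.join(workdir, "records_shelf")

# pickle: the whole dictionary is written as one serialized stream ...
with open(pickle_path, "wb") as f:
    pickle.dump(records, f)

# ... and has to be read back in full before any key can be looked up.
with open(pickle_path, "rb") as f:
    loaded = pickle.load(f)
bob_from_pickle = loaded["bob"]

# shelve: each key/value pair is stored as its own entry in a DBM file,
# so a single key can be read or written without loading the rest.
with shelve.open(shelf_path) as db:
    for name, data in records.items():
        db[name] = data

with shelve.open(shelf_path) as db:
    bob_from_shelf = db["bob"]
    del db["alice"]  # a deletion may leave unused space in the DBM file
    remaining_keys = sorted(db.keys())
```

Whether a deletion actually leaves a gap, and how large the file gets, depends on which DBM backend shelve ends up using on a given system.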
So, just to make sure I follow, in short:
* pickle makes "persistent objects" flushed to a file
* shelve + dbm is in fact a (relational) database storing
(arbitrary**) objects in some hash-searchable order
(**) By arbitrary I mean _any_ kind of object, or do they need to be
of the same class? (In order for searching to work)
About usage: shelves when I need to search and pickle when I just need
to save my objects? Are there other considerations based on number and
size of objects?
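As a small experiment on the "same class?" question, here is a sketch, assuming a current Python 3 and its standard shelve module. The key names and the path are invented for the example. As far as I understand it, shelve keys must be strings and the values are pickled, so lookup goes by key only and the class of the stored value should not matter.

```python
import datetime
import os
import shelve
import tempfile

# Illustrative path only; any writable location would do.
shelf_path = os.path.join(tempfile.mkdtemp(), "mixed_shelf")

# Store values of three unrelated classes under string keys.
with shelve.open(shelf_path) as db:
    db["date"] = datetime.date(1999, 7, 2)
    db["numbers"] = [1, 2, 3]
    db["label"] = "hello"

# Reopen the shelf and look each value up by its key.
with shelve.open(shelf_path) as db:
    kinds = {key: type(db[key]).__name__ for key in db}
    stored_date = db["date"]
```

Here kinds maps each key to the class name of the value that came back, which gives a quick check that values of different classes can sit side by side in one shelf.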
Best regards
-- Thomas S. Strinnhed, thstr at serop.abb.se