default shelve on linux corrupts, does different DB system help?

skip at pobox.com
Fri Mar 13 08:03:38 EDT 2009


    Paul> I have the problem that my shelve(s) sometimes corrupt (looks like
    Paul> it has after python has run out of threads).

    Paul> I am using the default shelve so on linux I get the dbhash
    Paul> version.  Is there a different DB type I can choose that is known
    Paul> to be more resilient? And if so, what is the elegant way of doing
    Paul> that?

You don't say which version of Python you're using or which version of the
Berkeley DB library underpins your installation, but I am going to guess it
is 1.85.  That version's hash file implementation has been known to have
serious bugs for over a decade.  (The btree and recno formats are ok.
Unfortunately, the hash file implementation is what everybody has always
gravitated to.  Sort of like moths to a flame...)
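
If you aren't sure which format an existing shelf file uses, the standard
library can usually identify it.  A minimal sketch, assuming Python 3, where
the function is dbm.whichdb (under Python 2 it is whichdb.whichdb, which
reports a Berkeley DB hash file as "dbhash").  The temporary file and the
dbm.dumb backend are my own additions so that the sketch runs anywhere:

```python
import dbm
import dbm.dumb
import os
import tempfile

# Create a small throwaway database so there is something to inspect.
# dbm.dumb is used only because it ships with every Python 3 install.
path = os.path.join(tempfile.mkdtemp(), "example.db")
db = dbm.dumb.open(path, "c")
db[b"key"] = b"value"
db.close()

# whichdb() returns the name of the module that can open the file,
# None if the file does not exist, or "" if the format is not recognized.
fmt = dbm.whichdb(path)
print(fmt)
```

Point it at the path you pass to shelve.open() to see what your own shelf
is stored as.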

If that's the case, simply pick some other dbm file format for your shelves,
e.g.:

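    >>> import gdbm
    >>> import shelve
    >>> # Create the file with gdbm first; shelve.open() detects the
    >>> # format of an existing file and keeps using it.
    >>> f = gdbm.open("/tmp/trash.db", "c")
    >>> f.close()
    >>> db = shelve.open("/tmp/trash.db")
    >>> db["mike"] = "sharon"
    >>> db["4"] = 5
    >>> db.keys()
    ['4', 'mike']
    >>> db.close()
    >>> # Reopen with gdbm directly: the values are stored as pickles.
    >>> f = gdbm.open("/tmp/trash.db", "c")
    >>> f.keys()
    ['4', 'mike']
    >>> f['4']
    'I5\n.'
    >>> f['mike']
    "S'sharon'\np1\n."
    >>> f.close()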
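
Another way to get the same effect is to hand shelve an already opened
database object instead of relying on format detection.  A minimal sketch,
assuming Python 3, where the gdbm module is named dbm.gnu.  The helper
open_shelf is hypothetical, and the fallback to dbm.dumb is my own
assumption, there only so the sketch still runs where GNU dbm isn't
installed:

```python
import dbm
import os
import shelve
import tempfile


def open_shelf(path, flag="c"):
    """Open a shelf on an explicitly chosen dbm backend (hypothetical helper)."""
    try:
        import dbm.gnu as backend   # Python 3 name for the gdbm module
    except ImportError:
        import dbm.dumb as backend  # pure-Python fallback, always available
    # shelve.Shelf wraps any dict-like object that stores bytes keys/values.
    return shelve.Shelf(backend.open(path, flag))


path = os.path.join(tempfile.mkdtemp(), "trash.db")

db = open_shelf(path)
db["mike"] = "sharon"
db["4"] = 5
db.close()

# Once the file exists, plain shelve.open() detects its format and reuses it.
print(dbm.whichdb(path))
with shelve.open(path) as db:
    print(sorted(db.keys()))
```

Under Python 2 the equivalent would be shelve.Shelf(gdbm.open(path, "c")).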

As for "uncorrupting" your existing database, see if your Linux distribution
has a db_recover program.  If it does, you might be able to retrieve your
data, though in the case of the Berkeley DB 1.85 hash file format I'm
skeptical that can be done.  I hope you weren't storing something valuable
in it like your bank account passwords.
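
If the damaged shelf can still be opened at all, one fallback is to copy
whatever entries remain readable into a fresh shelf.  This is not what
db_recover does, and I can't promise a corrupted 1.85 hash file will even
survive being iterated, so treat it as a sketch.  The names salvage and
FlakyMapping are hypothetical; FlakyMapping only simulates a shelf with one
unreadable entry so that the example runs on its own:

```python
def salvage(src, dst):
    """Copy every readable entry from src to dst.

    Returns (number_copied, list_of_keys_that_failed).
    """
    copied = 0
    failed = []
    for key in list(src.keys()):
        try:
            dst[key] = src[key]
        except Exception:
            # A damaged record may fail to read or to unpickle; skip it.
            failed.append(key)
        else:
            copied += 1
    return copied, failed


class FlakyMapping(dict):
    """Stand-in for a damaged shelf: reading the key "bad" always fails."""

    def __getitem__(self, key):
        if key == "bad":
            raise ValueError("simulated unreadable record")
        return dict.__getitem__(self, key)


old = FlakyMapping({"mike": "sharon", "4": 5, "bad": None})
new = {}
copied, failed = salvage(old, new)
print(copied, failed)
```

With real data, src would be the old shelf and dst a new one created on a
different dbm format.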

-- 
Skip Montanaro - skip at pobox.com - http://www.smontanaro.net/


