[Python-Dev] Re: Distutils confuses Berkeley DB and dbm?

Skip Montanaro skip@pobox.com
Fri, 31 May 2002 00:23:42 -0500


    Guido> Sounds reasonable to me, but I'm no expert in this area.  (I
    Guido> still have a pending bug report on why whichdb gets confused
    Guido> sometimes.)

I've been looking at that as well.  whichdb isn't actually very smart and
has an inconsistent relationship with its input "filename".  It "knows" that
certain types of db libraries build files with certain extensions and it
knows the magic numbers of a few others.  If you ask it what kind of file
"foo" is, it first appends certain file extensions to "foo" to see whether
it's a dbm or dumbdbm file.  The original dbm (and I think ndbm) C
libraries created "foo.pag" and "foo.dir" when asked to create a "foo" file.

When you use Berkeley DB's dbm-compatible API it still creates a Berkeley DB
file under the covers, so the filename sniff tests that would return "dbm"
fail.  whichdb then moves on to the next step and simply tries to open the
filename as given.  That fails too when you call whichdb.whichdb("foo") on a
database created by Berkeley DB, because the actual file is "foo.db".  The
first call below returns None, the second identifies the file:

    >>> import whichdb
    >>> whichdb.whichdb("foo")
    >>> whichdb.whichdb("foo.db")
    'dbhash'
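
To make the two steps concrete, here is a rough sketch of that kind of
two-stage sniff.  It is not the whichdb source: the function name sniff() is
made up for illustration, only the dbm and Berkeley DB hash cases are
covered, and the magic number and 12-byte offset are what I believe newer
Berkeley DB hash files use, so treat them as assumptions.

```python
import os
import struct

# Berkeley DB hash magic number; assumed here for illustration.
BDB_HASH_MAGIC = 0x00061561


def sniff(name):
    """Hypothetical two-step sniff, loosely modeled on the behavior above."""
    # Step 1: treat `name` as a prefix and look for the dbm file pair.
    if os.path.exists(name + ".pag") and os.path.exists(name + ".dir"):
        return "dbm"

    # Step 2: treat `name` as a literal file name and read its header.
    try:
        with open(name, "rb") as f:
            header = f.read(16)
    except IOError:
        # No file with exactly this name, so nothing can be identified.
        return None

    if len(header) < 16:
        return ""

    # Assumption: the magic number sits after a 12-byte pad.
    (magic,) = struct.unpack("=l", header[12:16])
    if magic == BDB_HASH_MAGIC:
        return "dbhash"
    return ""
```

Step 1 reads the argument as a prefix and step 2 reads it as a file name,
which is why a Berkeley DB file stored as "foo.db" is only found when the
caller passes the full name.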

So whichdb is broken.  It tries to treat its argument as both a file prefix
and a file name.  I don't think you can have it both ways.  Fixing it
properly will require one or the other of these interpretations of its input
argument to disappear.
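
As one possible illustration of keeping a single interpretation, here is a
sketch of the prefix-only reading.  The helper name files_for_prefix() and
the extension list are assumptions for the example, not a proposed patch;
the real set of extensions depends on which db libraries are in play.

```python
import os

# Extensions assumed for illustration only.
KNOWN_EXTENSIONS = (".pag", ".dir", ".db")


def files_for_prefix(prefix):
    """Return the existing files that a database named `prefix` might use."""
    # The argument is always a prefix; it is never opened as a file itself.
    return [prefix + ext for ext in KNOWN_EXTENSIONS
            if os.path.exists(prefix + ext)]
```

Under this reading a caller would always pass "foo", never "foo.db", and the
type check would run against whichever of the listed files exist.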

Skip