[Python-bugs-list] [ python-Bugs-491888 ] whichdb lies about db type
noreply@sourceforge.net
Tue, 13 Aug 2002 01:32:49 -0700
Bugs item #491888, was opened at 2001-12-12 14:22
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=491888&group_id=5470
Category: Python Library
Group: Python 2.1.1
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Richard Jones (richard)
Assigned to: Martin v. Löwis (loewis)
Summary: whichdb lies about db type
Initial Comment:
>>> import dbm
>>> d = dbm.open('foo', 'n')
>>> d['a'] = 'b'
>>> d.close()
>>> import whichdb
>>> whichdb.whichdb('foo.db')
'dbhash'
I'm currently testing for the existence of "foo.db"
instead of "foo" and hard-code my routines to use dbm
if there is a "foo.db" file (since all other db
modules that I've tested do not append ".db")
Might it also be possible to have anydbm perform a
whichdb check in its open function, so that older
databases are usable with newer, more feature-full
installations that might include "better" dbm
backends?
----------------------------------------------------------------------
>Comment By: Richard Jones (richard)
Date: 2002-08-13 18:32
Message:
Logged In: YES
user_id=6405
Sorry, yes, I believe with Skip's fixes this can be closed.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2002-08-13 18:10
Message:
Logged In: YES
user_id=21627
So can this report be closed then?
----------------------------------------------------------------------
Comment By: Richard Jones (richard)
Date: 2002-08-13 08:06
Message:
Logged In: YES
user_id=6405
I believe your patch has fixed the problem. There were no version
changes, just the ".db" extension which was confusing anydbm. Your
assertion that
dbhash.open('foo', 'r')
should work doesn't fly here:
Python 2.2.1 (#1, Apr 9 2002, 13:10:27)
[GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-98)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dbm, dbhash, whichdb
>>> dbm.open('spam', 'n')
<dbm.dbm object at 0x815a0f0>
>>> _.close()
>>> whichdb.whichdb('spam')
>>> whichdb.whichdb('spam.db')
'dbhash'
>>> dbhash.open('spam', 'r')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "/usr/lib/python2.2/dbhash.py", line 16, in open
return bsddb.hashopen(file, flag, mode)
bsddb.error: (2, 'No such file or directory')
>>> dbhash.open('spam.db', 'r')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "/usr/lib/python2.2/dbhash.py", line 16, in open
return bsddb.hashopen(file, flag, mode)
bsddb.error: (22, 'Invalid argument')
>>> dbm.open('spam', 'r')
<dbm.dbm object at 0x815a0f0>
>>>
I don't see how that's "pilot error". Your fixes will solve the problem
though. anydbm will be able to reopen the dbm-created files it creates
:)
----------------------------------------------------------------------
Comment By: Skip Montanaro (montanaro)
Date: 2002-08-13 07:06
Message:
Logged In: YES
user_id=44345
If the user opened the file with
db = anydbm.open("foo", "c")
*and* the dbm module happened to be selected by anydbm *and*
dbmmodule.so happened to be linked with BerkDB, the file created will
be named "foo.db" and will actually be a BerkDB hash file (whose
version depends on the version of the library installed). If the user later
asks whichdb.whichdb what type of file "foo" is, my latest change
correctly responds "dbm". If, on the other hand, the user asks
whichdb.whichdb what type of file "foo.db" is, it should now respond
"dbhash". This is what my recent patch to the whichdb module fixed.
It would be incorrect to try to open "foo.db" with the dbm module.
If a bsddb.error exception is raised, it's almost certainly because the user
upgraded the BerkDB library, but didn't run the tools provided by
Sleepycat to upgrade his or her preexisting files. I don't see how there's
a Python problem here that needs solving. It's simply pilot error. The
best we can do I think is improve the message associated with the
exception which the module raises. (Something like "invalid file format"
instead of simply "invalid argument".)
In my previous note I made a mistake. Instead of
    He should have called
    dbhash.open('foo', 'r')
    as he later demonstrated.
the function call should have been "dbm.open".
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2002-08-12 07:13
Message:
Logged In: YES
user_id=21627
Skip, I think you misunderstand the complaint. It's not
about the way in which an error message is given, but that
the error message is given at all.
The file is a dbm file, and the dbm module is capable of
opening it, so no error should be reported at all.
----------------------------------------------------------------------
Comment By: Skip Montanaro (montanaro)
Date: 2002-07-26 01:03
Message:
Logged In: YES
user_id=44345
Martin's comment in bug 584409 reminded me that I have a patched
whichdb module which should cure this problem. (At the moment my
dbm module is linked with gdbm, not BerkDB, however, so while I've
tested this in the past, I can't provide you with an interactive
demonstration at the moment.) Note that Richard was forced to do
something for which whichdb was not designed. I believe with this
patch he should be able to once again ask for simply "foo" and not
wonder what extensions the underlying db package adds to the files.
I still don't think version information would help here. Richard's tests
are flawed. Berkeley DB only adds ".db" to the end of the file when
using the dbm-compatibility API. He should have called
dbhash.open('foo', 'r')
as he later demonstrated. While somewhat mystifying, the bsddb.error
is more or less correct. We should probably trap that and raise a "file
not found" error or just try a stat() call if the db file is to be opened for
reading.
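The stat()-style check suggested above could look roughly like the sketch below. It is not what dbhash does; the helper name `find_db_file` and the list of suffixes to probe are assumptions made for illustration (".db" is the extension discussed in this report, ".dir" and ".pag" are the classic ndbm pair).

```python
import errno
import os


def find_db_file(path, suffixes=("", ".db", ".dir", ".pag")):
    """Return the first existing file among path + suffix.

    Hypothetical helper: the suffix list is an assumption, not something
    taken from the dbhash or dbm modules.
    """
    for suffix in suffixes:
        candidate = path + suffix
        if os.path.exists(candidate):
            return candidate
    # Raise an ordinary "file not found" error instead of letting the
    # backend report (2, 'No such file or directory') through bsddb.error.
    raise IOError(errno.ENOENT, os.strerror(errno.ENOENT), path)
```

A caller opening a database read-only would run this first and hand the returned name to the backend, so a missing file is reported as such before the database library is involved.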
Assigning to Martin for consideration.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2001-12-13 20:43
Message:
Logged In: YES
user_id=21627
I see. The problem appears to be that your BSDDB
installation, which implements hash version 7, does not
simultaneously support hash version 5 anymore. This
primarily is a problem in the Sleepycat version shipped with
your system (for not supporting old databases), and in glibc
(for not incorporating a newer bsd db). Python can work
around this problem, at best - there might always be DBHASH
files that none of the DB implementations on a system can open.
bsddb should expose version information, like DB_HASHVERSION
and DB_HASHOLDVER (the current and the minimum hash
version). Unfortunately, db_185.h, as used by bsddb.c, does
not provide these constants, and db_185.h cannot be used
simultaneously with db.h. db_185.h exposes a HASHVERSION
constant, but that seems to stay at 2 regardless of the file
version that the compatibility API uses.
The right solution seems to be to drop support for the DB1 API,
and mandate a DB2-or-better db.h. I'd personally recommend
integrating pybsddb.sf.net into Python 2.3, adding
portability to BSDDB 2 if necessary (it could be a
build-time decision to build either source module as bsddb).
For the moment, I cannot recommend a good work-around; I see
two options:
- find out magically (by looking at db.h) what hash versions
dbhash will support, then check the version of the hash
file, and refuse to use dbhash if the version won't be
supported. Since this requires magic, such code should not
be added to Python, but left to the application.
- catch bsddb.error on dbhash.open, and retry with dbm.open.
This is a heuristic which also shouldn't be added to
Python, but which may be acceptable to the application.
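For an application that accepts the second heuristic, a minimal sketch follows. The function name `open_with_fallback` and the shape of the `attempts` list are assumptions for illustration. Each backend gets its own filename, since in this report dbhash is pointed at "foo.db" while dbm is pointed at "foo".

```python
def open_with_fallback(flag, attempts):
    """Try each backend in turn and return (label, db) for the first success.

    `attempts` is a list of (label, open_func, filename, error_types)
    tuples. This is an application-level heuristic, not library behaviour.
    """
    failures = []
    for label, open_func, filename, error_types in attempts:
        try:
            return label, open_func(filename, flag)
        except error_types as exc:
            # Remember why this backend was rejected, then try the next one.
            failures.append((label, exc))
    raise IOError("no backend could open the database: %r" % (failures,))


# With the Python 2 modules discussed in this report, a call might be:
#   open_with_fallback("r", [
#       ("dbhash", dbhash.open, "foo.db", bsddb.error),
#       ("dbm", dbm.open, "foo", dbm.error),
#   ])
```

Keeping the failures around means the final error can say which backends were tried and what each one reported.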
----------------------------------------------------------------------
Comment By: Richard Jones (richard)
Date: 2001-12-13 18:05
Message:
Logged In: YES
user_id=6405
Sorry about the anydbm/whichdb confusion - reading the
source a little closer would have avoided my confusion.
Regardless, there is still a problem that on my system,
dbm files are reported as dbhash, and dbhash can't open
the dbm files...
[richard@co3044991-a tmp]$ python
Python 2.1.1 (#1, Aug 30 2001, 17:36:05)
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.61mdk)] on
linux-i386
Type "copyright", "credits" or "license" for more
information.
>>> import dbm
>>> dbm.open('foo','n')
<dbm object at 0x812a0f0>
>>> import dbhash
>>> dbhash.open('bar', 'n')
<bsddb object at 0x812b870>
>>>
>>> import whichdb
>>> whichdb.whichdb('foo.db')
'dbhash'
>>> whichdb.whichdb('bar')
'dbhash'
>>> dbhash.open('foo.db', 'r')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "/usr/lib/python2.1/dbhash.py", line 16, in open
return bsddb.hashopen(file, flag, mode)
bsddb.error: (-30990, 'Unknown error 4294936306')
>>> dbhash.open('bar', 'r')
<bsddb object at 0x812ef48>
>>>
[richard@co3044991-a tmp]$ file foo.db
foo.db: Berkeley DB (Hash, version 5, native byte-order)
[richard@co3044991-a tmp]$ file bar
bar: Berkeley DB (Hash, version 7, native byte-order)
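The hash version that file(1) prints can also be read from Python. The sketch below assumes the header layout that file(1) and whichdb appear to rely on: the magic number 0x00061561 at byte offset 12 with the version at offset 16, or, for old 1.85-style files, the magic at offset 0 with the version at offset 4. The function name `bdb_hash_version` is made up for illustration, and the layout should be checked against the installed Berkeley DB before relying on it.

```python
import struct

BDB_HASH_MAGIC = 0x00061561


def bdb_hash_version(path):
    """Return the hash version stored in the file header, or None.

    Assumed layouts, as (magic offset, version offset): (12, 16) for
    newer files and (0, 4) for 1.85-style files.
    """
    with open(path, "rb") as f:
        header = f.read(20)
    for magic_off, version_off in ((12, 16), (0, 4)):
        if len(header) < version_off + 4:
            continue
        # Files are written in the creating machine's byte order,
        # so try both little- and big-endian.
        for order in ("<", ">"):
            (magic,) = struct.unpack_from(order + "I", header, magic_off)
            if magic == BDB_HASH_MAGIC:
                (version,) = struct.unpack_from(order + "I", header, version_off)
                return version
    return None
```

On the two files above this would be expected to report 5 for foo.db and 7 for bar, if the assumed layout holds.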
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2001-12-13 10:28
Message:
Logged In: YES
user_id=21627
I fail to see the problem altogether. What system are you
on? Why do you think dbm does not create dbhash files? It is
not just that the magic says they are BSDDB DB_HASH files,
they really are of that kind?
Also, which of the APIs (dbm, dbhash) do you consider
"better"? I'd say that dbhash is better, since it builds
upon bsddb. So whichdb, and anydbm, do use the "better" dbm
backend already?
Where is the bug?
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum)
Date: 2001-12-12 15:28
Message:
Logged In: YES
user_id=6380
Hm. anydbm *does* use whichdb. The problem seems to be that
the dbm file really *does* look like a BSD hash -- the Unix
file(1) command has the same problem.
But I'm not sure I understand your question. Do you have a
particular patch in mind?
----------------------------------------------------------------------