[Python-bugs-list] whichdb is coded wrong (PR#97)
guido@CNRI.Reston.VA.US
Wed, 6 Oct 1999 11:12:15 -0400 (EDT)
> I attempted to reproduce the exercise in Lutz's "Programming Python"
> book using the anydbm module, pp. 39-41. After creating the
> underlying file, I re-open the file, again using anydbm, and a simple
> program fails with
>
> File "/user/lib/python1.5/anydbm.py", line 83, in open
> raise error, "db type cannot be determined"
> anydbm.error: db type cannot be determined
Apparently new versions of bsddb have 12 null bytes in front of the
magic number. A patch for whichdb.py exists in the CVS archives; it
is reproduced here:
Index: whichdb.py
===================================================================
RCS file: /projects/cvsroot/python/dist/src/Lib/whichdb.py,v
retrieving revision 1.4
retrieving revision 1.5
diff -c -r1.4 -r1.5
*** whichdb.py 1998/04/28 15:41:03 1.4
--- whichdb.py 1999/06/08 13:13:16 1.5
***************
*** 31,39 ****
      except IOError:
          return None

!     # Read the first 4 bytes of the file -- the magic number
!     s = f.read(4)
      f.close()

      # Return "" if not at least 4 bytes
      if len(s) != 4:
--- 31,40 ----
      except IOError:
          return None

!     # Read the start of the file -- the magic number
!     s16 = f.read(16)
      f.close()
+     s = s16[0:4]

      # Return "" if not at least 4 bytes
      if len(s) != 4:
***************
*** 48,53 ****
--- 49,64 ----
      # Check for GNU dbm
      if magic == 0x13579ace:
          return "gdbm"
+
+     # Check for BSD hash
+     if magic in (0x00061561, 0x61150600):
+         return "dbhash"
+
+     # BSD hash v2 has a 12-byte NULL pad in front of the file type
+     try:
+         (magic,) = struct.unpack("=l", s16[-4:])
+     except struct.error:
+         return ""

      # Check for BSD hash
      if magic in (0x00061561, 0x61150600):
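
For readers who want to try the logic outside of whichdb.py, here is a
minimal standalone sketch of what the patch does, written for a current
Python 3 rather than 1.5. The function name sniff_db_type and the two
constants are made up for illustration; they are not part of whichdb.
Unlike the patch, which takes s16[-4:], this sketch only looks at bytes
12-15 when a full 16 bytes were read.

```python
import struct

# Magic numbers taken from the patch above.
GDBM_MAGIC = 0x13579ACE
BSD_HASH_MAGICS = (0x00061561, 0x61150600)  # the two byte orders


def sniff_db_type(header):
    """Guess a db type from the first 16 bytes of a file.

    Hypothetical helper: returns "gdbm", "dbhash", or "" if unknown.
    """
    if len(header) < 4:
        return ""

    # "=l" means native byte order with a standard 4-byte size.
    (magic,) = struct.unpack("=l", header[:4])
    if magic == GDBM_MAGIC:
        return "gdbm"
    if magic in BSD_HASH_MAGICS:
        return "dbhash"

    # Newer BSD hash files: 12 bytes of padding, then the magic number.
    if len(header) >= 16:
        (magic,) = struct.unpack("=l", header[12:16])
        if magic in BSD_HASH_MAGICS:
            return "dbhash"

    return ""
```

To use it on a real file, read the header in binary mode, e.g.
sniff_db_type(open(filename, "rb").read(16)).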
> Instead of statically coding whichdb, which might fail for various
> distribution/platform types, couldn't you create a generator that, for the
> various test cases above, generates a "tailored" whichdb for that particular
> distribution/platform?
The whichdb module wants to be able to tell you the db type even if
you don't have the library code to read it. Hardcoding a list of
magic numbers is a common approach. Often (as you see here) the rules
aren't as simple as "look at the first 4 bytes", and adding a new file
type requires a little bit of thinking. Given the infrequent
appearance of new db types, an automated approach is hardly worth it.
(Prove me wrong by submitting the code :-)
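
To illustrate what a hardcoded list looks like once the rules stop being
"look at the first 4 bytes", here is a rough sketch that keeps the rules
in a table of (offset, magic values, type name). This is not how
whichdb.py is written; MAGIC_RULES and match_rules are hypothetical
names, and the code assumes a current Python 3.

```python
import struct

# Each rule: (byte offset, accepted magic values, db type name).
# Rules are tried in order; the values come from the patch above.
MAGIC_RULES = [
    (0, (0x13579ACE,), "gdbm"),
    (0, (0x00061561, 0x61150600), "dbhash"),
    (12, (0x00061561, 0x61150600), "dbhash"),  # 12-byte pad variant
]


def match_rules(header, rules=MAGIC_RULES):
    """Return the name of the first matching rule, or "" if none match."""
    for offset, magics, name in rules:
        chunk = header[offset:offset + 4]
        if len(chunk) != 4:
            continue  # header too short for this rule
        (magic,) = struct.unpack("=l", chunk)
        if magic in magics:
            return name
    return ""
```

Adding a file type whose magic number sits at a fixed offset is then one
more row; anything more irregular still needs hand-written code.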
--Guido van Rossum (home page: http://www.python.org/~guido/)