[Python-bugs-list] [ python-Bugs-584409 ] add way to detect bsddb version

noreply@sourceforge.net noreply@sourceforge.net
Thu, 25 Jul 2002 08:06:37 -0700


Bugs item #584409, was opened at 2002-07-21 02:29
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=584409&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: paul rubin (phr)
Assigned to: Nobody/Anonymous (nobody)
Summary: add way to detect bsddb version

Initial Comment:
The bsddb module docs say that some Python
configurations use Berkeley db 1.85
and others use the incompatible 2.0.  Maybe by now
there are later versions as well.  There's no way
listed for a Python script to know which version of
bsddb is running underneath!  That's not so great,
since the versions don't interoperate and don't support
the same operations.

Proposed fix: please add a new function to the module,
bsddb.db_version().  This would
return a constant string like "1.85" or "2.0", built at
Python configuration time.
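
A minimal sketch of how a script might consume the proposed call.
bsddb.db_version() is only the name suggested in this report and does
not exist in the module; describe_db_version is a hypothetical helper
that accepts any module-like object.

```python
def describe_db_version(module):
    """Return the string reported by module.db_version(), or None.

    db_version() is the function proposed in this report, not an existing
    bsddb API, so its absence is treated as "version unknown".
    """
    probe = getattr(module, "db_version", None)
    if probe is None:
        return None
    return probe()
```

A script could then branch on the returned string ("1.85", "2.0", ...)
and fall back to conservative behaviour when the result is None.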


----------------------------------------------------------------------

>Comment By: Skip Montanaro (montanaro)
Date: 2002-07-25 10:06

Message:
Logged In: YES 
user_id=44345

I can't comment on #504282 (I don't know what the problem is because 
the poster didn't provide enough information about the files and their 
names).  I attached a patch to #491888 which should solve that 
problem.

still-unconvinced-ly y'rs,

Skip


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-07-25 04:08

Message:
Logged In: YES 
user_id=21627

It would solve bug #491888, and allow a better
diagnostic for #504282.

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2002-07-24 13:10

Message:
Logged In: YES 
user_id=44345

What would you have it report?  Dbhash is nothing more than a thin 
wrapper around bsddb.  Whichdb is a very fragile beast in my opinion, 
but it does already do some file content introspection, and if the file is 
some sort of Berkeley DB hash file, it will report it more-or-less correctly 
as "dbhash" (more correct in my opinion than returning None or "").  
This includes files created using the dbm module, if that module was 
linked with the dbm emulation API of Berkeley DB.

I still fail to see how any of this detection people propose would help.  If 
you have a version 5 hash file it doesn't matter how positive you are 
about it.  A later version of the Berkeley DB library which expects a 
version 7 hash file is still going to barf on the older file format.  To make 
things work again you're going to have to resort to running Sleepycat's 
tools to convert the file to the proper format.  It's not like you can 
detect file version differences and then plunge ahead along a different 
path without alerting the user to the problem.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-07-24 12:29

Message:
Logged In: YES 
user_id=21627

No, the main point would be that whichdb would not
incorrectly report the file format as 'dbhash', when it
isn't (because dbhash supports a different version).

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2002-07-24 09:54

Message:
Logged In: YES 
user_id=44345

This is precisely what Sleepycat's db_dump/db_load type tools take care 
of.  It's a one-time thing.  When you upgrade from one version of 
Berkeley DB to another you need to run these tools to make sure the file 
formats are up-to-date.  The only problem I see here with the current 
code is that the exception which is raised is rather mystical - something 
like a very large number followed by "invalid argument".  The most 
significant change I would see making here is to have the bsddb module 
recognize that weird error and raise an exception with a saner message.
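
A rough sketch of that idea at the Python level, not the actual bsddb
change. DBFormatError and open_with_clear_error are hypothetical names,
opener stands in for something like bsddb.hashopen, and matching on the
text "invalid argument" is an assumption based on the description above.

```python
class DBFormatError(Exception):
    """Hypothetical exception carrying a readable file-format message."""


def open_with_clear_error(opener, path):
    """Call opener(path) and translate the cryptic failure.

    Any exception whose text mentions "invalid argument" is re-raised as
    DBFormatError with a more helpful message; others pass through.
    """
    try:
        return opener(path)
    except Exception as exc:
        if "invalid argument" in str(exc).lower():
            raise DBFormatError(
                "%s: could not be opened; the file may use a format version "
                "this Berkeley DB library does not support" % path
            ) from exc
        raise
```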

I can't see the programmer or the user getting more information out of 
"expected hash file format version 7 but got hash file format version 5".


----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2002-07-24 08:24

Message:
Logged In: YES 
user_id=12800

We could probably write a little utility to sniff file
version numbers based on the magic number as given in this doco:

http://www.sleepycat.com/docs/ref/install/magic.txt
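
A minimal sketch of such a sniffer for hash files only. It assumes the
hash magic number 0x061561 (the value whichdb also checks), with the
magic at offset 0 followed by the version in 1.85-style files, and at
offset 12 followed by the version in later files. The function name and
the layout labels are made up for illustration; the offsets should be
checked against the magic.txt document above.

```python
import struct

HASH_MAGIC = 0x00061561  # Berkeley DB hash magic, as also tested by whichdb


def sniff_hash_header(header):
    """Return (layout, version) for a Berkeley DB hash header, else None.

    header is the first 20 (or more) bytes of the file. Both byte orders
    are tried because the files are written in host byte order.
    """
    if len(header) < 20:
        return None
    for order in ("<", ">"):
        # Old layout (assumed): magic at offset 0, version at offset 4.
        magic, version = struct.unpack(order + "II", header[0:8])
        if magic == HASH_MAGIC:
            return ("1.85-style", version)
        # Later layout (assumed): 12 bytes of other metadata come first.
        magic, version = struct.unpack(order + "II", header[12:20])
        if magic == HASH_MAGIC:
            return ("2.x-or-later", version)
    return None
```

Reading the first 20 bytes of a file in binary mode and passing them to
sniff_hash_header() would give the file version without opening the
database through the library at all.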

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-07-24 02:34

Message:
Logged In: YES 
user_id=21627

There is a bug report (somewhere) that whichdb incorrectly
determines the DB module. In that case, whichdb would
correctly find out that this is a Sleepycat database, and
suggest using dbhash. In turn, dbhash would fail to open
the file, because the file version was incorrect. It would
have been correct to use the dbm module, since the dbm
library was also based on Sleepycat, but had a different
version than the bsddb library installed on the same system.

This problem can be solved if you can find out what file
version(s) your bsddb module supports.
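
To illustrate the reasoning, a small sketch of the selection step.
Everything here is hypothetical: no module currently reports the file
versions it supports, so the mapping is supplied by the caller and the
version numbers in the usage note are examples only.

```python
def modules_for_file_version(file_version, supported):
    """Return names of modules whose supported versions include file_version.

    supported maps a module name to the set of hash file format versions
    that module's underlying library can open.
    """
    return [name for name, versions in supported.items()
            if file_version in versions]
```

With supported = {"dbhash": {7}, "dbm": {5}}, a version 5 file would be
routed to dbm rather than dbhash, which is the case described above.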

The library version seems less useful to me, indeed.

----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2002-07-23 21:53

Message:
Logged In: YES 
user_id=12800

It's useful if for no other reason than to figure out which
bugs you need to work around <wink>.

BTW, PyBSDDB does give you the ability to find out both the
version of the wrapper you've got and the version of the
underlying library:

>>> import bsddb3
>>> bsddb3.__version__
'3.3.0'
>>> bsddb3._db.version()
(3, 3, 11)


You've also got DB_VERSION_STRING, DB_VERSION_MAJOR and
DB_VERSION_MINOR.

Note that if you're linking against a newer version of the
library using the 1.85 API, *that* might be a difficult
thing to figure out.  Off hand (and I can't check right
now), I don't know if that would give you a different
bsddb3._db version constant or would otherwise be detectable.
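
The same check as the interactive session above, wrapped so that a
script still runs when the wrapper is not installed. This is a sketch:
berkeley_db_versions is a hypothetical helper, and it assumes
bsddb3.db.version() is the public counterpart of the _db.version() call
shown above.

```python
def berkeley_db_versions():
    """Return (wrapper_version, library_version), or None without bsddb3."""
    try:
        import bsddb3
        from bsddb3 import db
    except ImportError:
        return None
    # db.version() is expected to return a (major, minor, patch) tuple
    # describing the Berkeley DB library the wrapper was linked against.
    return bsddb3.__version__, db.version()
```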

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2002-07-23 19:15

Message:
Logged In: YES 
user_id=44345

I agree, if it's wanted badly enough, we can figure out what version was 
linked with the module code. The "define macros at configure time" idea 
is possible.  The "create a database and peek at it" idea won't work 
though.  There are library version numbers and file versions.  They don't 
always change in sync.

Like I said before, I'm skeptical a Python script would really need to 
know what version of the underlying library was linked with 
bsddbmodule.o.  Can you motivate things with a use case?

Skip


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2002-07-23 18:35

Message:
Logged In: NO 

How can it be "impossible" to find out?  The build script
for the bsddb module can check what version is being linked,
and include a string reachable from Python.

At worst, there could be a routine added to the module that
actually creates a database, then examines the db file and
figures out from the bytes inside which version it is.

Paul


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-07-23 16:50

Message:
Logged In: YES 
user_id=21627

I also believe that this problem should be fixed by importing 
pybsddb3.

On this issue itself: it turns out to be impossible to find out, 
programmatically, what version of Sleepycat DB you are 
running if all you have is the compatibility API: neither the 
compile-time nor the run-time version information is 
available. Furthermore, you cannot include both new and old 
headers, since they conflict. So given the current code base, 
this problem cannot be solved.

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2002-07-22 20:18

Message:
Logged In: YES 
user_id=44345

Sorry for the lack of clarity.  What I should have said is that the code 
which implements the bsddb extension module only calls the 
1.85-compatible C API exposed when you configure the Berkeley DB 
code using the --enable-compat185 flag.  All the wonders and mysteries 
of the later parts of the API are lost on the bsddb code.

There are two levels of compatibility, the API level and the file format 
level.  All users of the bsddb module should care about is the file format 
level compatibility and handling that is a one-time problem dealt with 
using tools provided by Sleepycat as part of their distribution.

The topic of including bsddb3 in the standard distribution has been 
discussed before.  For one example, see:

  http://mail.python.org/pipermail/python-dev/2002-January/019261.html

I think the main stumbling block to incorporation is that it only works 
with versions 3 and 4 of the Berkeley DB library.  There is a more recent 
thread that currently escapes my feeble attempts to find it.



----------------------------------------------------------------------

Comment By: paul rubin (phr)
Date: 2002-07-22 15:26

Message:
Logged In: YES 
user_id=72053

OK, it looks like both the docs and Skip's note are a bit
unclear.  When you say only the 1.85 API is exposed, does
that mean the 1.85 file format is also used either way?  In
particular, if Python is linked with Berkeley DB 2.0 and I
create a db with it, will that db interoperate with another
application that's linked to Berkeley DB 1.85?

If it won't interoperate, then it's definitely worthwhile to
add some kind of call to the Python bsddb module to let
Python scripts find out which file format they're dealing
with.  

Also, I didn't realize only the 1.85 API was supported.  I
hope pybsddb3 can become part of the standard Python
distribution, since I'd like to use Sleepycat's transaction
features from Python scripts.

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2002-07-22 07:31

Message:
Logged In: YES 
user_id=44345

This is an interesting idea, but one that I think is less useful than you 
might believe.  The bsddb module exposes the same API based on the 
1.85 C API regardless of what version of Berkeley DB you link with.  (I 
have linked it with versions 1.85 through 4.something.)  I've been using 
the bsddb module since its inclusion in Python and have never actually 
cared what version of the underlying C API the module was linked with. 
Someone programming to the C API *would* care about version 
differences, because the C API has grown richer over the years.  The 
bsddb module code just hasn't ever used any new functionality.  Note 
that the pybsddb3 module does use the new functionality in the version 
3 and 4 APIs.

What changes on you between versions are the file formats, and you 
should only care about that at the point where you upgrade from one 
version of Berkeley DB to another.  (Generally, you realize this when you 
start getting errors trying to open old databases.)  Sleepycat provides 
command line tools to help you convert from one file version to another, 
so once you realize your file formats have changed, you wind up poking 
around your disk looking for old format Berkeley DB files, run the tools 
on them, then go back to more interesting things, like writing 
stable sorts. ;-)


----------------------------------------------------------------------
