Distutils confuses Berkeley DB and dbm?
Would someone on Linux please try the following:

    import dbm
    f = dbm.open("foo", "c")
    f["1"] = "1"
    f.close()

then ask the file command what kind of file it is. On my system it tells me the file is a Berkeley DB 1.85 hash file. I figure that distutils is getting ahold of the libdb dbm-compatibility include files and libraries and using them. On my system, the only ndbm.h is in /usr/include/db1. Here's how the dbm module builds on my system:

    building 'dbm' extension
    gcc -DNDEBUG -O3 -fPIC -I. -I/home/skip/src/python/head/dist/src/./Include -I/usr/local/include \
        -IInclude/ -c /home/skip/src/python/head/dist/src/Modules/dbmmodule.c -o \
        build/temp.linux-i686-2.3/dbmmodule.o
    gcc -shared build/temp.linux-i686-2.3/dbmmodule.o -L/usr/local/lib -ldb1 \
        -o build/lib.linux-i686-2.3/dbm.so

I know there is a bug report open about something related to building Berkeley DB-based modules, but I'm offline at the moment so I can't check it. Barry, were you going to look at this? How about a little collaboration? I think I'm mostly responsible for what distutils does as far as building bsddb.

Skip
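For readers on a current Python 3, a rough equivalent of asking the file command can be done from Python itself with `dbm.whichdb`, which guesses which dbm flavour wrote a database. This is only a minimal sketch under stated assumptions: it targets the modern `dbm` package rather than the 2002-era modules discussed in this thread, it uses `dbm.dumb` purely because that backend is always available, and the helper name `create_and_identify` is hypothetical.

```python
import dbm
import dbm.dumb
import os
import tempfile


def create_and_identify(directory):
    """Create a tiny database and report which dbm flavour dbm.whichdb sees.

    dbm.dumb is an assumption made for portability; replace the open call
    with dbm.open(path, "c") to see what the default backend produces.
    """
    path = os.path.join(directory, "foo")
    db = dbm.dumb.open(path, "c")
    db["1"] = "1"
    db.close()  # note the call: a bare "db.close" would not close anything
    return dbm.whichdb(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(create_and_identify(tmp))
```

Running the same check after `dbm.open(path, "c")` on a given system would show which backend the build actually picked up, which is the question being asked here.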
"SM" == Skip Montanaro <skip@pobox.com> writes:
    SM> Would someone on Linux please try the following:

    | import dbm
    | f = dbm.open("foo", "c")
    | f["1"] = "1"
    | f.close()

    SM> then ask the file command what kind of file it is. On my
    SM> system it tells me the file is a Berkeley DB 1.85 hash file.

On my RH6.1-ish system, foo.db is:

    Berkeley DB 2.X Hash/Little Endian (Version 5)

On my stock, but loaded Mandrake 8.2 system, foo.db is:

    Berkeley DB (Hash, version 5, native byte-order)

Note that my main problem with the bsddb module is building it. On the RH6.1 system I get:

    gcc -g -Wall -Wstrict-prototypes -fPIC -DHAVE_DB_185_H=1 -I. -I/home/barry/projects/python/./Include -I/usr/local/include -IInclude/ -c /home/barry/projects/python/Modules/bsddbmodule.c -o build/temp.linux-i686-2.3/bsddbmodule.o
    gcc -shared build/temp.linux-i686-2.3/bsddbmodule.o -L/usr/local/lib -ldb-3.1 -o build/lib.linux-i686-2.3/bsddb.so
    *** WARNING: renaming "bsddb" since importing it failed: build/lib.linux-i686-2.3/bsddb.so: undefined symbol: dbopen

but on the Mandrake 8.2 system I get:

    gcc -g -Wall -Wstrict-prototypes -fPIC -DHAVE_DB_185_H=1 -I/usr/include/db3 -I. -I/home/barry/projects/python/./Include -I/usr/local/include -IInclude/ -c /home/barry/projects/python/Modules/bsddbmodule.c -o build/temp.linux-i686-2.3/bsddbmodule.o
    gcc -shared build/temp.linux-i686-2.3/bsddbmodule.o -L/usr/local/lib -ldb2 -o build/lib.linux-i686-2.3/bsddb.so
    *** WARNING: renaming "bsddb" since importing it failed: build/lib.linux-i686-2.3/bsddb.so: undefined symbol: __db185_open

    SM> I know there is a bug report open about something related to
    SM> building Berkeley DB-based modules, but I'm offline at the
    SM> moment so I can't check it. Barry, were you going to look at
    SM> this? How about a little collaboration? I think I'm mostly
    SM> responsible for what distutils does as far as building bsddb.

I'm too tired to think about this more tonight, but if you want to hook up on irc.openprojects.net #python tomorrow, we can try to wade our way through things.

-Barry
Skip Montanaro <skip@pobox.com> writes:
Would someone on Linux please try the following:
    import dbm
    f = dbm.open("foo", "c")
    f["1"] = "1"
    f.close()
then ask the file command what kind of file it is. On my system it tells me the file is a Berkeley DB 1.85 hash file.
On my SuSE 8.0 installation, using /usr/bin/python, it tells me

    foo.dir: GNU dbm 1.x or ndbm database, little endian
    foo.pag: GNU dbm 1.x or ndbm database, little endian

Regards,
Martin
Skip Montanaro wrote:
Would someone on Linux please try the following:
    import dbm
    f = dbm.open("foo", "c")
    f["1"] = "1"
    f.close()
then ask the file command what kind of file it is. On my system it tells me the file is a Berkeley DB 1.85 hash file. I figure that distutils is getting ahold of the libdb dbm-compatibility include files and libraries and using them. On my system, the only ndbm.h is in /usr/include/db1. Here's how the dbm module builds on my system:
    building 'dbm' extension
    gcc -DNDEBUG -O3 -fPIC -I. -I/home/skip/src/python/head/dist/src/./Include -I/usr/local/include \
        -IInclude/ -c /home/skip/src/python/head/dist/src/Modules/dbmmodule.c -o \
        build/temp.linux-i686-2.3/dbmmodule.o
    gcc -shared build/temp.linux-i686-2.3/dbmmodule.o -L/usr/local/lib -ldb1 \
        -o build/lib.linux-i686-2.3/dbm.so
I know there is a bug report open about something related to building Berkeley DB-based modules, but I'm offline at the moment so I can't check it. Barry, were you going to look at this? How about a little collaboration? I think I'm mostly responsible for what distutils does as far as building bsddb.
Perhaps you ought to have a look at mx.BeeBase ? It's portable and fast, has locks and transactions. (And it builds on all platforms where egenix-mx-base builds.)

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/
Meet us at EuroPython 2002: http://www.europython.org/
    mal> Perhaps you ought to have a look at mx.BeeBase ? It's portable and
    mal> fast, has locks and transactions. (And it builds on all platforms
    mal> where egenix-mx-base builds.)

Perhaps, but that doesn't solve existing problems with building bsddb and the various dbm-compatibility modes available.

The bsddb build problem is essentially that some (many? most? all?) Linux distributions ship with multiple versions of the Berkeley DB library now. To make matters worse, they separate the shared libraries (in base rpms) from the include files (in -devel rpms). On my Mandrake system that gives you six possible rpms to install: dbX and dbX-devel, for X in {1,2,3}. Based on the way distutils checks for libraries and include files (which I believe is mostly my fault), if you have only one of any given pair of such rpms, like Barry, you might wind up compiling with one version of the library and trying to link with a different version.

I thought you were nominally against the idea of incorporating bits of mx into the core?

Skip
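The mismatch described above comes from searching for headers and libraries independently. One way to avoid it is to accept a Berkeley DB version only when its header and its library are both present. The following is a minimal sketch of that idea, not the actual distutils or setup.py logic; the function name `find_consistent_db`, the candidate table, and the file names in it are all assumptions made for illustration and do not describe any real distribution's layout.

```python
import os

# Hypothetical candidate table: (label, header relative path, library file name).
# These names are illustrative assumptions only.
CANDIDATES = [
    ("db3", os.path.join("db3", "db.h"), "libdb-3.so"),
    ("db2", os.path.join("db2", "db.h"), "libdb2.so"),
    ("db1", os.path.join("db1", "db.h"), "libdb1.so"),
]


def find_consistent_db(include_dirs, library_dirs, candidates=CANDIDATES):
    """Return (label, header_path, library_path) for the first candidate whose
    header AND library are both found, or None if no complete pair exists."""
    for label, header, library in candidates:
        header_path = _first_existing(include_dirs, header)
        library_path = _first_existing(library_dirs, library)
        # Accept a version only when both halves of the pair are installed,
        # so we never compile against one version and link against another.
        if header_path and library_path:
            return label, header_path, library_path
    return None


def _first_existing(directories, relative_name):
    """Return the first existing file directory/relative_name, else None."""
    for directory in directories:
        candidate = os.path.join(directory, relative_name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

With only a db3 header but no db3 library installed, this sketch skips db3 and falls through to the next complete pair, rather than mixing db3 headers with a db2 library.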
Skip Montanaro wrote:
    mal> Perhaps you ought to have a look at mx.BeeBase ? It's portable and
    mal> fast, has locks and transactions. (And it builds on all platforms
    mal> where egenix-mx-base builds.)
Perhaps, but that doesn't solve existing problems with building bsddb and the various dbm-compatibility modes available.
True, I just thought I'd drop in an idea for how to get around all the dbm problems.
The bsddb build problem is essentially that some (many? most? all?) Linux distributions ship with multiple versions of the Berkeley DB library now. To make matters worse, they separate the shared libraries (in base rpms) from the include files (in -devel rpms). On my Mandrake system that gives you six possible rpms to install: dbX and dbX-devel, for X in {1,2,3}. Based on the way distutils checks for libraries and include files (which I believe is mostly my fault), if you have only one of any given pair of such rpms, like Barry, you might wind up compiling with one version of the library and trying to link with a different version.
I thought you were nominally against the idea of incorporating bits of mx into the core?
Yep; but that doesn't keep you from using them :-)

--
Marc-Andre Lemburg
"SM" == Skip Montanaro <skip@pobox.com> writes:
    SM> The bsddb build problem is essentially that some (many? most?
    SM> all?) Linux distributions ship with multiple versions of the
    SM> Berkeley DB library now. To make matters worse, they separate
    SM> the shared libraries (in base rpms) from the include files (in
    SM> -devel rpms). On my Mandrake system that gives you six
    SM> possible rpms to install: dbX and dbX-devel, for X in {1,2,3}.
    SM> Based on the way distutils checks for libraries and include
    SM> files (which I believe is mostly my fault), if you have only
    SM> one of any given pair of such rpms, like Barry, you might wind
    SM> up compiling with one version of the library and trying to
    SM> link with a different version.

It's worse than that. On my MD8.2 system, I've got at least 4 versions of Berkeley installed:

    db1-1.85-7mdk + db1-devel-1.85-7mdk
    db2-2.4.14-5mdk + db2-devel-2.4.14-5mdk
    libdb3.3-3.3.11-7mdk + libdb3.3-devel-3.3.11-7mdk

Those are all installed with a fairly beefy package selection, but nonetheless stock packages. Then I've got Berkeley DB 3.3.11 installed from source sitting in its default installation spot of /usr/local/BerkeleyDB.3.3 and 4.0.14 sitting in /usr/local/BerkeleyDB.4.0 ... sigh!

I wouldn't argue if you said it was a mess. ;)

So what do we do? I still think that pybsddb is a worthy candidate for inclusion in the standard library and it should link against 3.3.11 out of the box. I believe the cvs snapshot of pybsddb links against 4.0.14 as well.

Then there's this bug

    http://sourceforge.net/tracker/index.php?func=detail&aid=408271&group_id=5470&atid=105470

Maybe we should lock bsddbmodule down to Berkeley 1.85 and then pull pybsddb (which exports as module bsddb3) for Berkeley 3.3.11. If there's a clamor for it, I suppose we could fake a bsddb2 and bsddb4 for those major versions of Berkeley.

-Barry
    BAW> Maybe we should lock bsddbmodule down to Berkeley 1.85 and then

Please don't do that. That creates two problems. On the one hand it will break code for people like me who successfully use db2 or db3. On the other hand, you will force all users of dbhash and bsddb and all users of anydbm who have bsddb installed to use the provably broken hash file implementation. If you want to lock bsddb down to the 1.85 API, force it to only build with db2 or db3 and reject attempts to compile/link it with db1.

    BAW> pull pybsddb (which exports as module bsddb3) for Berkeley 3.3.11.
    BAW> If there's a clamor for it, I suppose we could fake a bsddb2 and
    BAW> bsddb4 for those major versions of Berkeley.

Why the proliferation? I can see the argument for incorporating pybsddb into the core because it offers greater functionality, but why incorporate version names into module names that track releases? If Sleepycat's behavior in the past is any indication of its behavior in the future, they will change both the API and file formats in every release and give you various, incompatible tools with which to migrate each time.

Skip
On Fri, May 31, 2002 at 09:34:22AM -0500, Skip Montanaro wrote:
but why incorporate version names into module names that track releases?
I would very much like to see a few different modules for BSDDB - I still use BerkeleyDB 1.85 (to read/write DBs for a closed-source program), and use newer versions for other needs.

Oleg.
--
Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru
Programmers don't die, they just GOSUB without RETURN.
"SM" == Skip Montanaro <skip@pobox.com> writes:
    SM> If you want to lock bsddb down to the 1.85 API, force it to
    SM> only build with db2 or db3 and reject attempts to compile/link
    SM> it with db1.

That's more in line with what I was thinking.

    BAW> pull pybsddb (which exports as module bsddb3) for Berkeley
    BAW> 3.3.11. If there's a clamor for it, I suppose we could fake
    BAW> a bsddb2 and bsddb4 for those major versions of Berkeley.

    SM> Why the proliferation? I can see the argument for
    SM> incorporating pybsddb into the core because it offers greater
    SM> functionality, but why incorporate version names into module
    SM> names that track releases? If Sleepycat's behavior in the
    SM> past is any indication of its behavior in the future, they
    SM> will change both the API and file formats in every release and
    SM> give you various, incompatible tools with which to migrate
    SM> each time.

The problem seems to be trying to figure out which API is available and which you want to use. You might even want to use more than one in any given Python installation. So it seems reasonable to be explicit about the version of the API you need.

I'd still default "bsddb" to the 1.85 API even if that links with a later version of the library/headers (as long as it's consistent, as you suggested). Since pybsddb3 already exports itself as `bsddb3' that seems to make sense too, although how that interacts with Berkeley DB 4.0.14, I'm not sure (maybe that blows my theory).

-Barry
Skip Montanaro <skip@pobox.com> writes:
Why the proliferation? I can see the argument for incorporating pybsddb into the core because it offers greater functionality, but why incorporate version names into module names that track releases?
Because any specific version of the Sleepycat code can only access a few database file format versions. Older databases cannot be accessed with newer library implementations. Hence, you need to link explicitly with older libraries to access older databases.

Regards,
Martin
    Skip> Why the proliferation?

    Martin> Because any specific version of the Sleepycat code can only
    Martin> access a few database file format versions. Older databases
    Martin> cannot be accessed with newer library implementations. Hence,
    Martin> you need to link explicitly with older libraries to access older
    Martin> databases.

I think it's a bad idea to tie into an external vendor's release idiosyncrasies that way. Sleepycat does provide utilities with each release which allow users to migrate from older file formats to newer ones. I would rather see some embellishment of the bsddb module that allows programmers to query file format types and possibly convert file formats by calling out to the Sleepycat-provided utilities.

Skip
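A format query of the kind suggested above could start from the magic number at the beginning of the file. The sketch below is not an existing bsddb API: `sniff_db_format` and its return labels are hypothetical. The magic values and offsets follow the checks Python's own `whichdb` code has used (GNU dbm at offset 0, a Berkeley DB 1.85 hash file at offset 0, later Berkeley DB hash files behind a 12-byte pad); accepting either byte order for every magic is a simplification assumed here.

```python
import struct

# Magic numbers as used by Python's whichdb checks.
GDBM_MAGICS = (0x13579ACE, 0x13579ACD, 0x13579ACF)
BDB_HASH_MAGIC = 0x00061561


def _magic_matches(chunk, magic):
    """True if the 4-byte chunk holds `magic` in either byte order."""
    if len(chunk) != 4:
        return False
    little = struct.unpack("<I", chunk)[0]
    big = struct.unpack(">I", chunk)[0]
    return magic in (little, big)


def sniff_db_format(path):
    """Return a hypothetical label for the database format found at path."""
    with open(path, "rb") as handle:
        header = handle.read(16)
    first, padded = header[0:4], header[12:16]
    if any(_magic_matches(first, magic) for magic in GDBM_MAGICS):
        return "gdbm"
    if _magic_matches(first, BDB_HASH_MAGIC):
        return "bsddb-1.85-hash"  # magic at offset 0
    if _magic_matches(padded, BDB_HASH_MAGIC):
        return "bsddb-2+-hash"    # magic after a 12-byte pad
    return "unknown"
```

A caller could use the returned label to decide whether to open the file directly or to hand it to one of the vendor's upgrade utilities first; btree and recno files are deliberately left out of this sketch.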
"SM" == Skip Montanaro <skip@pobox.com> writes:
    Skip> Why the proliferation?

    Martin> Because any specific version of the Sleepycat code can
    Martin> only access a few database file format versions. Older
    Martin> databases cannot be accessed with newer library
    Martin> implementations. Hence, you need to link explicitly with
    Martin> older libraries to access older databases.

    SM> I think it's a bad idea to tie into an external vendor's
    SM> release idiosyncrasies that way. Sleepycat does provide
    SM> utilities with each release which allow users to migrate from
    SM> older file formats to newer ones. I would rather see some
    SM> embellishment of the bsddb module that allows programmers to
    SM> query file format types and possibly convert file formats by
    SM> calling out to the Sleepycat-provided utilities.

There are really two issues, the file format and the API version. It probably makes sense to keep `bsddb' as the 1.85 API. Maybe it's enough for pybsddb (aka bsddb3) to be the 3.x API and not expose a 2.x API. But then, what about the BDB 4.x API?

I agree that we probably don't need to support multiple older file formats, since there are tools to upgrade, as long as we fail gracefully when handed a format we don't understand. By that I mean, get an exception, not a crash.

-Barry
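The "exception, not a crash" behaviour asked for above is roughly what the modern `dbm` package provides: `dbm.open` raises one of the exceptions in the `dbm.error` tuple when the file type cannot be determined. The following minimal sketch runs on current Python 3 and says nothing about how the 2002-era bsddb module behaved; `open_or_explain` is a hypothetical helper name.

```python
import dbm


def open_or_explain(path):
    """Open an existing database read-only.

    Returns (db, None) on success, or (None, reason) when the file is
    missing or in a format the installed dbm backends do not understand.
    """
    try:
        return dbm.open(path, "r"), None
    except dbm.error as exc:  # dbm.error is a tuple of exception classes
        return None, str(exc)
```

Returning the reason as a string keeps the sketch short; a real module would more likely let a dedicated exception propagate so callers can decide whether to convert the file.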
barry@zope.com (Barry A. Warsaw) writes:
Maybe we should lock bsddbmodule down to Berkeley 1.85 and then pull pybsddb (which exports as module bsddb3) for Berkeley 3.3.11. If there's a clamor for it, I suppose we could fake a bsddb2 and bsddb4 for those major versions of Berkeley.
I completely agree that this should be the strategy.

Regards,
Martin
participants (5)
- barry@zope.com
- M.-A. Lemburg
- martin@v.loewis.de
- Oleg Broytmann
- Skip Montanaro