
In a message of Tue, 16 Nov 2010 02:22:56 +0100, "Amaury Forgeot d'Arc" writes:
Hi,
2010/11/16 Dan Stromberg drsalists@gmail.com:
I've put together a gdbm wrapper module using ctypes, and tested against pypy with the included test_gdbm. Is there interest in adding it to pypy?
It's at http://stromberg.dnsalias.org/svn/gdbm-ctypes/trunk/ if anyone wants to look it over.
I don't know exactly what gdbm is, but it looks very similar to dbm.py already in pypy: http://codespeak.net/svn/pypy/trunk/lib_pypy/dbm.py
Both gdbm and dbm are part of the python standard library, so I'd guess we'd better have both of them.
http://docs.python.org/library/gdbm.html
Laura
Amaury Forgeot d'Arc

On Mon, Nov 15, 2010 at 6:22 PM, Laura Creighton lac@openend.se wrote:
Both gdbm and dbm are part of the python standard library, so I'd guess we'd better have both of them.
Sounds good to me :)
BTW, it might cause confusion down the road to call something that is basically like cpython's bsddb (Berkeley DB) by the name "dbm" in pypy's library. In the cpython standard library, "dbm" is an interface to ndbm databases. These all provide the same dictionary-like interface to Python programs, but have somewhat different C APIs, and quite different, incompatible on-disk representations.
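To illustrate the "same dictionary-like interface" point, here is a minimal sketch. It assumes Python 3, where the always-available `dbm.dumb` backend stands in for ndbm, gdbm or Berkeley DB; the helper name `roundtrip` is made up for this example and is not part of any of the modules discussed in the thread.

```python
import dbm.dumb
import os
import tempfile


def roundtrip(path):
    """Write one key through the dict-like interface and read it back."""
    db = dbm.dumb.open(path, "n")   # "n": always create a new, empty database
    db["a"] = "b"                   # str keys and values are stored as bytes
    db.close()

    db = dbm.dumb.open(path, "r")   # reopen the same files read-only
    try:
        return db["a"], list(db.keys())
    finally:
        db.close()


# Assumption: a throwaway temporary directory is an acceptable place to write.
workdir = tempfile.mkdtemp()
value, keys = roundtrip(os.path.join(workdir, "d"))
```

The calling code would look the same with another backend; what changes is the set of files that ends up on disk, which is the incompatibility described above.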

On 16/11/10 04:30, Dan Stromberg wrote:
BTW, it might cause confusion down the road to call something that is basically like cpython's bsddb (Berkeley DB) by the name "dbm" in pypy's library. In the cpython standard library, "dbm" is an interface to ndbm databases. These all provide the same dictionary-like interface to Python programs, but have somewhat different C APIs, and quite different, incompatible on-disk representations.
Hi Dan, I played a bit (veeeery quickly) with dbm on both pypy and cpython, and I'm not sure I get what you mean when you say that our dbm.py is equivalent to cpython's bsddb. E.g., I can create a db on cpython and open it from pypy, so it seems that the two modules are compatible.
Moreover, I checked which libraries the module links to. On CPython, it links to libdb-4.8.so:
viper2 ~ $ ldd /usr/lib/python2.6/lib-dynload/dbm.so
    linux-gate.so.1 => (0x00884000)
    libdb-4.8.so => /usr/lib/libdb-4.8.so (0x00110000)
    libpthread.so.0 => /lib/libpthread.so.0 (0x003de000)
    libc.so.6 => /lib/libc.so.6 (0x003f8000)
    /lib/ld-linux.so.2 (0x002e0000)
The pypy version first tries to open libdb.so, then libdb-4.5.so. I had to manually modify it to open version 4.8 (I agree that we should find a more general way to find it), but apart from that, as far as I can see, it uses the same underlying library as CPython.
So, to summarise: could you elaborate a bit more why we should delete dbm.py from pypy?
ciao, Anto

On Fri, Nov 26, 2010 at 3:12 PM, Antonio Cuni anto.cuni@gmail.com wrote:
So, to summarise: could you elaborate a bit more why we should delete dbm.py from pypy?
Looks like dbm at the API level: CPython dbm, pypy dbm
Looks like dbm on disk: CPython dbm
Looks like bsddb at the API level: CPython bsddb
Looks like bsddb on disk: CPython bsddb, pypy dbm
Don't let the common prefix fool you - libdb is Berkeley DB, while dbm is supposed to be ndbm.
http://docs.python.org/library/dbm.html
That is, pypy's dbm.py is perfectly self-consistent (other than a couple of likely memory leaks), but if you try to open a database from CPython using pypy's dbm module (or vice-versa), I don't believe it'll work. EG:
$ /usr/local/cpython-2.7/bin/python
Python 2.7 (r27:82500, Aug 2 2010, 19:15:05)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dbm
>>> d = dbm.open('d', 'n')
>>> d['a'] = 'b'
>>> d.close()
benchbox-dstromberg:/tmp/dbm-test i686-pc-linux-gnu 30890 - above cmd done 2010 Fri Nov 26 04:14 PM
$ /usr/local/pypy-1.4/bin/pypy
Python 2.5.2 (79529, Nov 25 2010, 20:40:03)
[PyPy 1.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``casuality violations and flying''
>>>> import dbm
>>>> d = dbm.open('d', 'r')
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/pypy-1.4/lib_pypy/dbm.py", line 172, in open
    raise error("Could not open file %s.db" % filename)
error: Could not open file d.db
HTH
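One way to check which on-disk format a database actually has, before blaming either interpreter, is the standard library's format sniffer. The sketch below assumes Python 3, where it is spelled `dbm.whichdb` (Python 2 shipped it as a separate `whichdb` module); the `describe` helper and the use of `dbm.dumb` to create a sample file are illustrative choices, not something from the thread.

```python
import dbm
import dbm.dumb
import os
import tempfile


def describe(path):
    """Return a short description of the on-disk format behind `path`."""
    kind = dbm.whichdb(path)
    if kind is None:
        # whichdb returns None when the file is missing or unreadable
        return "missing or unreadable"
    if kind == "":
        # an empty string means the file exists but the format was not recognised
        return "unrecognised format"
    return kind  # name of the module that should be able to open it


# Create a sample database with the pure-Python backend, then inspect it.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "d")
db = dbm.dumb.open(path, "n")
db["a"] = "b"
db.close()

existing = describe(path)
missing = describe(os.path.join(workdir, "nope"))
```

Running the same check against a file written by one interpreter, from the other interpreter, would show whether the two `dbm` modules agree on the format.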

On Fri, Nov 26, 2010 at 4:18 PM, Dan Stromberg drsalists@gmail.com wrote:
Interesting. My CPython 2.7 build has:
$ ldd dbm.so
    linux-gate.so.1 => (0x009ed000)
    libgdbm.so.3 => /usr/lib/libgdbm.so.3 (0x00ed5000)
    libgdbm_compat.so.3 => /usr/lib/libgdbm_compat.so.3 (0x00269000)
    libpthread.so.0 => /lib/libpthread.so.0 (0x00df3000)
    libc.so.6 => /lib/libc.so.6 (0x00425000)
    /lib/ld-linux.so.2 (0x00b7b000)
benchbox-dstromberg:/usr/local/cpython-2.7/lib/python2.7/lib-dynload i686-pc-linux-gnu 30430 - above cmd done 2010 Fri Nov 26 06:48 PM
...but http://docs.python.org/library/dbm.html plainly says it should be ndbm.
So which is wrong? The doc, or the module that's picking gdbm or Berkeley DB, as it sees fit?

On Fri, Nov 26, 2010 at 6:51 PM, Dan Stromberg drsalists@gmail.com wrote:
But then there's this at the URL above:
This module can be used with the “classic” ndbm interface, the BSD DB compatibility interface, or the GNU GDBM compatibility interface. On Unix, the *configure* script will attempt to locate the appropriate header file to simplify building this module.
I suppose that means that if it can't find ndbm (which at one time was hard due to licensing, but last I heard it'd become readily available), it's free to pretend it has ndbm using something else.
I'd call that puzzlingly worded - it's not the interface that's changing, but the backend implementation. But perhaps dbm.py is free to use Berkeley DB if it prefers to pretend it can never find ndbm. And perhaps I shouldn't have skimmed that page so quickly ^_^
CPython 2.7's configure script has:
  --with-dbmliborder=db1:db2:...
                          order to check db backends for dbm. Valid value is a
                          colon separated string with the backend names
                          `ndbm', `gdbm' and `bdb'.

On 27/11/10 04:07, Dan Stromberg wrote:
CPython 2.7's configure script has:
  --with-dbmliborder=db1:db2:...
                          order to check db backends for dbm. Valid value is a
                          colon separated string with the backend names
                          `ndbm', `gdbm' and `bdb'.
so, having a dbm.py which links to libdb is fine, and it's also what you get with cpython on ubuntu. There is still the issue of how to find the correct library name, as it seems it can vary (it was db-4.5 when the module was written, it's db-4.8 nowadays), but this is a bit orthogonal to what we are discussing here.
To summarize, I think we should keep the current dbm.py to link against libdb, and integrate your gdbm.py to link against libgdbm. But before merging it to trunk, I'd like to solve the issue of code duplication between the two modules.
ciao, Anto
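On the library-name question, one possible approach is to let `ctypes.util.find_library` resolve the name rather than hard-coding a versioned file name. This is only a sketch: the function name `load_first_available` and the candidate list are assumptions made for illustration (the versions are the ones mentioned in this thread), and it is not what pypy's dbm.py actually does.

```python
import ctypes
import ctypes.util


def load_first_available(names=("db", "db-4.8", "db-4.5")):
    """Return (name, handle) for the first library that loads, else (None, None)."""
    for name in names:
        # find_library returns a path or file name, or None if nothing matches
        path = ctypes.util.find_library(name)
        if path is None:
            continue
        try:
            return name, ctypes.CDLL(path)
        except OSError:
            # Found by the lookup but not loadable; try the next candidate.
            continue
    return None, None
```

A caller would check for `(None, None)` and raise an import-time error in that case, so a missing Berkeley DB shows up as a clear message rather than a failure inside `open`.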

On Mon, Nov 29, 2010 at 1:00 AM, Antonio Cuni anto.cuni@gmail.com wrote:
To summarize, I think we should keep the current dbm.py to link against libdb, and integrate your gdbm.py to link against libgdbm. But before merging it to trunk, I'd like to solve the issue of code duplication between the two modules.
Agreed.
I should have time for this sometime this week or the next.