[pypy-dev] gdbm

Dan Stromberg drsalists at gmail.com
Sat Nov 27 04:07:49 CET 2010


On Fri, Nov 26, 2010 at 6:51 PM, Dan Stromberg <drsalists at gmail.com> wrote:

>
> On Fri, Nov 26, 2010 at 4:18 PM, Dan Stromberg <drsalists at gmail.com> wrote:
>
>>
>> On Fri, Nov 26, 2010 at 3:12 PM, Antonio Cuni <anto.cuni at gmail.com> wrote:
>>
>>> On 16/11/10 04:30, Dan Stromberg wrote:
>>>
>>>> BTW, it might cause confusion down the road to call something that is
>>>> basically like cpython's bsddb (Berkeley DB) by the name "dbm" in pypy's
>>>> library.  In the cpython standard library, "dbm" is an interface to ndbm
>>>> databases.  These all provide the same dictionary-like interface to Python
>>>> programs, but have somewhat different APIs to C, and pretty different,
>>>> incompatible on-disk representations.
>>>>
>>>
>>> Hi Dan,
>>> I played a bit (veeeery quickly) with dbm on both pypy and cpython, and
>>> I'm not sure I get what you mean when you say that our dbm.py is equivalent
>>> to cpython's bsddb. E.g., I can create a db on cpython and open it from
>>> pypy, so it seems that the two modules are compatible.
>>>
>>> Moreover, I checked which libraries the module links to. On CPython, it links to
>>> libdb-4.8.so:
>>>
>>> viper2 ~ $ ldd /usr/lib/python2.6/lib-dynload/dbm.so
>>>        linux-gate.so.1 =>  (0x00884000)
>>>        libdb-4.8.so => /usr/lib/libdb-4.8.so (0x00110000)
>>>        libpthread.so.0 => /lib/libpthread.so.0 (0x003de000)
>>>        libc.so.6 => /lib/libc.so.6 (0x003f8000)
>>>        /lib/ld-linux.so.2 (0x002e0000)
>>>
>>> the pypy version first tries to open libdb.so, then libdb-4.5.so. I had
>>> to manually modify it to open version 4.8 (I agree that we should find a
>>> more general way to find it), but apart from that what I can see is that it
>>> uses the same underlying wrapper as CPython.
>>>
>>> So, to summarise: could you elaborate a bit more why we should delete
>>> dbm.py from pypy?
>>>
>>> ciao,
>>> Anto
>>>
>>
>>
>> Looks like dbm at the API level: CPython dbm, pypy dbm
>> Looks like dbm on disk: CPython dbm
>> Looks like bsddb at the API level: CPython bsddb
>> Looks like bsddb on disk: CPython bsddb, pypy dbm
>>
>> Don't let the common prefix fool you - libdb is Berkeley DB, while dbm is
>> supposed to be ndbm.
>>
>> http://docs.python.org/library/dbm.html
>>
>> That is, pypy's dbm.py is perfectly self-consistent (other than a couple
>> of likely memory leaks), but if you try to open a database from CPython
>> using pypy's dbm module (or vice-versa), I don't believe it'll work.  EG:
>>
>> $ /usr/local/cpython-2.7/bin/python
>> Python 2.7 (r27:82500, Aug  2 2010, 19:15:05)
>> [GCC 4.4.3] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> import dbm
>> >>> d = dbm.open('d', 'n')
>> >>> d['a'] = 'b'
>> >>> d.close()
>> >>>
>> benchbox-dstromberg:/tmp/dbm-test i686-pc-linux-gnu 30890 - above cmd done
>> 2010 Fri Nov 26 04:14 PM
>>
>> $ /usr/local/pypy-1.4/bin/pypy
>> Python 2.5.2 (79529, Nov 25 2010, 20:40:03)
>> [PyPy 1.4.0] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> And now for something completely different: ``casuality violations and
>> flying''
>> >>>> import dbm
>> >>>> d = dbm.open('d', 'r')
>> Traceback (most recent call last):
>>   File "<console>", line 1, in <module>
>>   File "/usr/local/pypy-1.4/lib_pypy/dbm.py", line 172, in open
>>     raise error("Could not open file %s.db" % filename)
>> error: Could not open file d.db
>> >>>>
>>
>> HTH
>>
> Interesting.  My CPython 2.7 build has:
>
> $ ldd dbm.so
>         linux-gate.so.1 =>  (0x009ed000)
>         libgdbm.so.3 => /usr/lib/libgdbm.so.3 (0x00ed5000)
>         libgdbm_compat.so.3 => /usr/lib/libgdbm_compat.so.3 (0x00269000)
>         libpthread.so.0 => /lib/libpthread.so.0 (0x00df3000)
>         libc.so.6 => /lib/libc.so.6 (0x00425000)
>         /lib/ld-linux.so.2 (0x00b7b000)
> benchbox-dstromberg:/usr/local/cpython-2.7/lib/python2.7/lib-dynload
> i686-pc-linux-gnu 30430 - above cmd done 2010 Fri Nov 26 06:48 PM
>
> ...but http://docs.python.org/library/dbm.html plainly says it should be
> ndbm.
>
> So which is wrong?  The doc, or the module that's picking gdbm or Berkeley
> DB, as it sees fit?
>
>
But then there's this at the URL above:

This module can be used with the “classic” ndbm interface, the BSD DB
compatibility interface, or the GNU GDBM compatibility interface. On Unix,
the *configure* script will attempt to locate the appropriate header file to
simplify building this module.

I suppose that means that if it can't find ndbm (which at one time was hard
due to licensing, but last I heard it'd become readily available), it's free
to pretend it has ndbm using something else.

I'd call that puzzlingly worded - it's not the interface that's changing,
but the backend implementation.  But perhaps dbm.py is free to use Berkeley
DB if it prefers to pretend it can never find ndbm.  And perhaps I shouldn't
have skimmed that page so quickly ^_^
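
Since the Python-level interface stays the same while the on-disk format depends on the backend, one way to see what actually wrote a given file is to ask the standard library. The sketch below is not from the thread: it assumes Python 3, where the check lives at dbm.whichdb (on the 2.x interpreters used above it is the separate whichdb module), and it uses dbm.dumb only because that backend is always available. describe_backend is a hypothetical helper name.

```python
import dbm
import dbm.dumb
import os
import tempfile


def describe_backend(path):
    """Report which dbm flavour seems to have written `path` (hypothetical helper)."""
    kind = dbm.whichdb(path)
    if kind is None:
        # whichdb returns None when the file is missing or cannot be read
        return "missing or unreadable"
    if kind == "":
        # whichdb returns '' when the file exists but the format is not recognised
        return "unrecognised format"
    return kind


# Create a throwaway database with the pure-Python backend, then inspect it.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "d")
db = dbm.dumb.open(path, "n")
db[b"a"] = b"b"
db.close()

print(describe_backend(path))
print(describe_backend(os.path.join(workdir, "nope")))
```

Pointing the same helper at the 'd' database created in the CPython session above should name whichever backend that build was really linked against, assuming the interpreter running the check has a matching backend compiled in.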

CPython 2.7's configure script has:

  --with-dbmliborder=db1:db2:...
                          order to check db backends for dbm. Valid value is a
                          colon separated string with the backend names
                          `ndbm', `gdbm' and `bdb'.
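
To make the format of that option concrete, here is a rough Python sketch of reading such a value. It only mirrors the help text above (a colon-separated list drawn from ndbm, gdbm and bdb); how configure itself validates the value is not shown in this thread, and parse_dbmliborder is a hypothetical name.

```python
# Backend names taken from the configure help text quoted above.
VALID_BACKENDS = ("ndbm", "gdbm", "bdb")


def parse_dbmliborder(value):
    """Split a colon-separated backend order and reject unknown names."""
    names = value.split(":")
    unknown = [name for name in names if name not in VALID_BACKENDS]
    if unknown:
        raise ValueError("unknown dbm backend(s): %s" % ", ".join(unknown))
    return names


print(parse_dbmliborder("gdbm:ndbm:bdb"))
```

So a build configured with --with-dbmliborder=ndbm:gdbm:bdb would be asking for ndbm to be tried first, with gdbm and Berkeley DB as fallbacks.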