[Python-3000] Nonlinearity in dbm.ndbm?

Josiah Carlson josiah.carlson at gmail.com
Sun Sep 7 05:58:20 CEST 2008


On Sat, Sep 6, 2008 at 8:37 PM,  <skip at pobox.com> wrote:
>
>    Josiah> The version I just posted to the tracker reads/writes about 30k
>    Josiah> entries/second.  You may want to look at the differences (looks
>    Josiah> to be due to your lack of a primary key/index).
>
>    me> Thanks.  The real speedup was to avoid using cursors.
>
> Let me take another stab at this.  My __setitem__ looks like this:
>
>    def __setitem__(self, key, val):
>        c = self._conn.cursor()
>        c.execute("replace into dict"
>                  " (key, value) values (?, ?)", (key, val))
>        self._conn.commit()
>
> This works (tests pass), but is slow (23-25 msec per loop).  If I change it
> to this:
>
>    def __setitem__(self, key, val):
>        self._conn.execute("replace into dict"
>                           " (key, value) values (?, ?)", (key, val))
>
> which is essentially your __setitem__ without the type checks on the key and
> value, it runs much faster (about 300 usec per loop), but the unit tests
> fail.  This also works:
>
>    def __setitem__(self, key, val):
>        self._conn.execute("replace into dict"
>                           " (key, value) values (?, ?)", (key, val))
>        self._conn.commit()
>
> I think you need the commits and have to suffer with the speed penalty.
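
For reference, here is a minimal self-contained sketch of the pattern
we're both describing.  The table name matches the snippets above, but
the schema (a text primary key, which "replace into" needs in order to
behave as replace-or-insert) is my assumption, since neither of us
posted the CREATE TABLE:

    import sqlite3

    class SQLiteDict:
        def __init__(self, path=":memory:"):
            self._conn = sqlite3.connect(path)
            # Assumed schema: the primary key is what makes
            # "replace into" act as replace-or-insert.
            self._conn.execute("create table if not exists dict"
                               " (key text primary key, value text)")
            self._conn.commit()

        def __setitem__(self, key, val):
            # Connection.execute creates and discards a cursor
            # internally, so there is no cursor to manage per call.
            # The commit is what dominates the cost: each one forces
            # the transaction to disk.
            self._conn.execute("replace into dict"
                               " (key, value) values (?, ?)", (key, val))
            self._conn.commit()

        def __getitem__(self, key):
            row = self._conn.execute("select value from dict"
                                     " where key = ?", (key,)).fetchone()
            if row is None:
                raise KeyError(key)
            return row[0]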

I guess I need to look at your unit tests, because in my testing,
reading and writing through a single instance works great; it's only
when you want changes to be seen by other instances (in other threads
or processes) that you need to .commit() them.  I think that's a
reasonable expectation; I never expected bsddb databases to share
their data with other processes until I called .sync(), but maybe I
never expected much from my dbm-like interfaces?
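
To make the visibility point concrete, here is a quick sketch with two
connections to the same database file (the file and table names are
just illustrative):

    import os
    import sqlite3
    import tempfile

    path = os.path.join(tempfile.mkdtemp(), "shared.db")

    writer = sqlite3.connect(path)
    writer.execute("create table if not exists dict"
                   " (key text primary key, value text)")
    writer.commit()

    reader = sqlite3.connect(path)

    # The write happens inside an implicit transaction, so the second
    # connection cannot see it yet.
    writer.execute("replace into dict (key, value) values (?, ?)",
                   ("a", "1"))
    print(reader.execute("select count(*) from dict").fetchone()[0])  # 0

    # After the commit, the row is visible to every connection.
    writer.commit()
    print(reader.execute("select count(*) from dict").fetchone()[0])  # 1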

 - Josiah

