[Python-Dev] Re: test_bsddb blocks testing popitem - reason

Tue Oct 28 05:12:21 EST 2003

On Monday 27 October 2003 10:56 pm, Gregory P. Smith wrote:
> On Mon, Oct 27, 2003 at 11:25:16AM +0100, Alex Martelli wrote:
> > I still don't quite see how the lock ends up being "held", but, don't
> > mind me -- the intricacy of mixins and wrappings and generators and
> > delegations in those modules is making my head spin anyway, so it's
> > definitely not surprising that I can't quite see what's going on.
>
> BerkeleyDB internally always grabs a read lock (i believe at the page
> level; i don't think BerkeleyDB does record locking) for any database read
> when opened with DB_THREAD | DB_INIT_LOCK flags.  I believe the problem
> is that a DBCursor object holds this lock as long as it is open/exists.
> Other reads can go on happily, but writes must wait for the read lock
> to be released before they can proceed.

Aha, much clearer, thanks.

> What about the behaviour of multiple iterators for the same dict being
> used at once (either interleaved or by multiple threads; it shouldn't
> matter)?  I expect that works fine in python.

If the dict is not being modified, or if the only modifications on it are
assigning different values for already-existing keys, multiple iterators
on the same unchanging dict do work fine in one or more threads.
But note that iterators only "read" the dict, don't change it.  If any
change to the set of keys in the dict happens, all bets are off.
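
To make the distinction concrete, here is a minimal sketch in plain Python
(nothing bsddb-specific; the variable names are just for illustration).
Current CPython happens to notice a change in the dict's size and raises
RuntimeError on the next step, but that is a courtesy of the implementation,
not something to rely on; a change that leaves the size the same may not be
noticed at all.

```python
d = {"a": 1, "b": 2, "c": 3}

# Two independent iterators over the same dict, used interleaved.
it1 = iter(d)
it2 = iter(d)
first = next(it1)
also_first = next(it2)      # it2 is not disturbed by it1 having advanced

# Rebinding the value of an already-existing key leaves the key set alone,
# so the iterators carry on.
d[first] = 100
rest = list(it1)            # the remaining two keys

# Changing the set of keys while an iterator is live is the unsupported case.
it3 = iter(d)
next(it3)
d["z"] = 26                 # a new key: the key set has changed
try:
    next(it3)
    outcome = "no error"
except RuntimeError:
    outcome = "RuntimeError"
```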

> This is something the _DBWithCursor iteration interface does not currently
> support due to its use of a single DBCursor internally.
>
> _DBWithCursor is currently written such that the cursor is never closed
> once created.  This leaves tons of potential for deadlock even in single
> threaded apps.  Reworking _DBWithCursor into a _DBThatUsesCursorsSafely
> such that each iterator creates its own cursor in an internal pool
> and other non cursor methods that would write to the db destroy all
> cursors after saving their current() position so that the iterators can
> reopen+reposition them is a solution.

Woof.  I think I understand what you're saying.  However, writing to a
dict (in the sense of changing the sets of keys) while one is iterating
on the dict is NOT supported in Python -- basically "undefined behavior"
(which does NOT include possibilities of crashes and deadlocks, though).
So, maybe, we could get away with something a bit less rich here?
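
For reference, here is how I read the proposal, as a toy sketch in plain
Python. It makes no real bsddb calls: ToyDB, ToyCursor and SafeDB are made-up
names, the "read lock" is just a counter, and a write that would block raises
an exception instead. Each iterator gets its own cursor; a write first saves
every cursor's last key and closes it, and the iterator reopens and
repositions on its next step.

```python
import bisect


class ToyDB:
    """Hypothetical stand-in for the underlying DB (not the bsddb API)."""
    def __init__(self, data):
        self.data = dict(data)
        self.open_cursors = 0       # models read locks held by cursors

    def write(self, key, value):
        if self.open_cursors:
            # the real thing would block here; the toy raises instead
            raise RuntimeError("would block: a cursor still holds a read lock")
        self.data[key] = value


class ToyCursor:
    """Hypothetical cursor: holds a 'read lock' from creation until close()."""
    def __init__(self, db, after=None):
        self.db = db
        db.open_cursors += 1
        self.keys = sorted(db.data)
        # start at the beginning, or just past the saved key when reopening
        self.pos = 0 if after is None else bisect.bisect_right(self.keys, after)
        self.last = after

    def next(self):
        if self.pos >= len(self.keys):
            return None
        self.last = self.keys[self.pos]
        self.pos += 1
        return self.last

    def close(self):
        self.db.open_cursors -= 1


class SafeDB:
    """One cursor per iterator; writes close all cursors, saving positions."""
    def __init__(self, db):
        self.db = db
        self.pool = {}      # iterator id -> open cursor
        self.saved = {}     # iterator id -> last key seen before being closed
        self.next_id = 0

    def iterkeys(self):
        ident = self.next_id
        self.next_id += 1
        self.pool[ident] = ToyCursor(self.db)
        try:
            while True:
                cur = self.pool.get(ident)
                if cur is None:
                    # a write closed our cursor: reopen and reposition
                    cur = ToyCursor(self.db, after=self.saved.pop(ident))
                    self.pool[ident] = cur
                key = cur.next()
                if key is None:
                    return
                yield key
        finally:
            cur = self.pool.pop(ident, None)
            if cur is not None:
                cur.close()
            self.saved.pop(ident, None)

    def __setitem__(self, key, value):
        # save each cursor's position, close it, then do the write
        for ident, cur in list(self.pool.items()):
            self.saved[ident] = cur.last
            cur.close()
        self.pool.clear()
        self.db.write(key, value)
```

Note that the toy repositions by key, so a key added during iteration may or
may not be seen, which is consistent with "undefined behavior" but is a
choice of the sketch, not something the proposal spells out.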

> > Given that in bsddb's case that iteritems() first [and only]
> > next() boils down to a self.first() which in turn does a
> > self.dbc.first() I _still_ don't see exactly what's holding the
> > lock.  But the simplest fix would appear to be in __delitem__,
> > i.e., if we have a cursor we should delete through it:
> >
> >     def __delitem__(self, key):
> >         self._checkOpen()
> >         if self.dbc is not None:
> >             self.dbc.set(key)
> >             self.dbc.delete()
> >         else:
> >             del self.db[key]
> >
> > ...but this doesn't in fact remove the deadlock on the
> > unit-test for popitem, which just confirms I don't really
> > grasp what's going on, yet!-)
>
> hmm.  i would've expected your __delitem__ to work.  Regardless, using the

Ah!  I'll check again -- maybe I did something silly -- but what happens
now is that the __delitem__ DOES work, the key does get deleted according
to print and printf's I've sprinkled here and there, BUT then right after the
key is deleted everything deadlocks anyway (in test_popitem).

> debugger I can stop the deadlock from occurring if i do "self.dbc.close();
> self.dbc = None" just before popitem's "del self[k]"

So, maybe I _should_ just fix popitem that way and see if all tests pass?
I dunno -- it feels a bit like fixing the symptoms and leaving some deep
underlying problems intact...
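
Spelled out, the workaround would look something like the sketch below. It
only models the explanation given above (a cursor holds a read lock until it
is closed) and does not use the real bsddb classes: ToyDB, ToyCursor and
Wrapper are hypothetical stand-ins, and where the real code would deadlock
the toy raises RuntimeError so the difference can be seen without hanging.

```python
class ToyCursor:
    """Hypothetical stand-in for a DBCursor: holds a read lock until closed."""
    def __init__(self, db):
        self.db = db
        db.read_locks += 1

    def first(self):
        key = min(self.db.data)
        return key, self.db.data[key]

    def close(self):
        self.db.read_locks -= 1


class ToyDB:
    """Hypothetical stand-in for the DB; raises where the real one blocks."""
    def __init__(self, data):
        self.data = dict(data)
        self.read_locks = 0

    def cursor(self):
        return ToyCursor(self)

    def delete(self, key):
        if self.read_locks:
            raise RuntimeError("would deadlock: read lock still held")
        del self.data[key]


class Wrapper:
    """Keeps a single cursor in self.dbc, never closed once created."""
    def __init__(self, db):
        self.db = db
        self.dbc = None

    def first(self):
        if self.dbc is None:
            self.dbc = self.db.cursor()
        return self.dbc.first()

    def __delitem__(self, key):
        self.db.delete(key)

    def popitem_naive(self):
        k, v = self.first()     # opens the cursor, which stays open
        del self[k]             # the write now waits on our own read lock
        return k, v

    def popitem_closing(self):
        k, v = self.first()
        self.dbc.close()        # the two lines tried in the debugger
        self.dbc = None
        del self[k]
        return k, v
```

This shows why closing the cursor helps under that explanation; it says
nothing about whether other methods that write while self.dbc is open have
the same problem.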

Any other opinions?  I don't have any strong feelings one way or the
other, except that I really think unit-tests SHOULD pass... and indeed
that changes should not be committed UNLESS unit-tests pass...

Alex