
On Mon, Oct 27, 2003 at 11:25:16AM +0100, Alex Martelli wrote:
I still don't quite see how the lock ends up being "held", but, don't mind me -- the intricacy of mixins and wrappings and generators and delegations in those modules is making my head spin anyway, so it's definitely not surprising that I can't quite see what's going on.
BerkeleyDB internally always grabs a read lock (I believe at the page level; I don't think BerkeleyDB does record-level locking) for any database read when the database is opened with the DB_THREAD | DB_INIT_LOCK flags. I believe the problem is that a DBCursor object holds this lock as long as it is open/exists. Other reads can go on happily, but writes must wait for the read lock to be released before they can proceed.
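As a minimal illustration of that behaviour, assuming the standard bsddb module (the file name is arbitrary, and whether a write actually blocks depends on the BerkeleyDB version and how the environment was opened):

    import bsddb

    db = bsddb.btopen('/tmp/demo.db', 'c')
    db['a'] = '1'
    db['b'] = '2'

    # first() creates the internal DBCursor and positions it on the
    # first record; the cursor pins a read lock on that page for as
    # long as it stays open.
    k, v = db.first()

    # A write from another thread now blocks until the cursor is
    # closed -- and with a single shared cursor, a write from this
    # same code path can deadlock against it, as described above.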
How do Python dictionaries deal with modifications to the dictionary intermixed with iteration?
In general, Python doesn't deal well with modifications to any iterable in the course of a loop using an iterator on that iterable.
The one kind of "modification during the loop" that does work is:
    for k in somedict:
        somedict[k] = ...whatever...
i.e. one can change the values corresponding to keys, but not change the set of keys in any way -- any change to the set of keys can cause endless loops or other such misbehavior (though not deadlocks or crashes...).
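Concretely, for a plain dict (the exact failure mode for key changes varies by version: recent CPython raises a RuntimeError rather than looping):

    d = {'a': 1, 'b': 2}

    # Changing values while iterating is fine:
    for k in d:
        d[k] = d[k] * 10

    # Changing the set of keys is not; depending on the version this
    # raises "RuntimeError: dictionary changed size during iteration"
    # or misbehaves in other ways:
    #     for k in d:
    #         del d[k]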
However, on a real Python dict, k, v = thedict.iteritems().next() doesn't constitute "a loop" -- the iterator object returned by the iteritems call is dropped immediately, since there are no outstanding references to it right after this statement. So a following del thedict[k] is quite all right -- the dictionary isn't being "looped on" at that point.
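So, on a plain dict, the pattern under discussion is perfectly safe:

    thedict = {'a': 1, 'b': 2}

    # The anonymous iterator is garbage-collected as soon as this
    # statement completes, so no iteration is "in progress" here.
    k, v = thedict.iteritems().next()

    del thedict[k]    # fine -- nothing is looping over thedict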
What about the behaviour of multiple iterators over the same dict being used at once (either interleaved or from multiple threads; it shouldn't matter)? I expect that works fine in Python. It is something the _DBWithCursor iteration interface does not currently support, due to its use of a single DBCursor internally. _DBWithCursor is currently written such that the cursor is never closed once created. That leaves tons of potential for deadlock even in single-threaded apps. One solution is to rework _DBWithCursor into a _DBThatUsesCursorsSafely, where each iterator creates its own cursor in an internal pool, and the non-cursor methods that write to the db close all cursors after saving their current() positions, so that the iterators can later reopen and reposition them (see the sketch below).
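A rough sketch of that rework (helper names like _new_cursor and _close_cursors_saving_positions are hypothetical, and error handling plus the actual reopen/reposition step are elided):

    class _DBThatUsesCursorsSafely:
        """Sketch: one cursor per iterator, pooled and closable."""

        def __init__(self, db):
            self.db = db
            self._cursors = []    # pool of live DBCursor objects

        def _new_cursor(self):
            # Each iterator gets its own cursor from the pool.
            dbc = self.db.cursor()
            self._cursors.append(dbc)
            return dbc

        def _close_cursors_saving_positions(self):
            # Called before any write: close every cursor so its read
            # lock is released, remembering where each one stood
            # (assumes each pooled cursor has already been positioned).
            saved = [dbc.current()[0] for dbc in self._cursors]
            for dbc in self._cursors:
                dbc.close()
            self._cursors = []
            return saved

        def __delitem__(self, key):
            saved = self._close_cursors_saving_positions()
            del self.db[key]    # no cursor holds a read lock now
            # The iterators would later call _new_cursor() and
            # set_range() back to the positions in 'saved'.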
Given that in bsddb's case that iteritems()'s first [and only] next() boils down to a self.first(), which in turn does a self.dbc.first(), I _still_ don't see exactly what's holding the lock. But the simplest fix would appear to be in __delitem__: if we have a cursor, we should delete through it:
    def __delitem__(self, key):
        self._checkOpen()
        if self.dbc is not None:
            # Delete through the cursor, so the delete doesn't
            # block against the cursor's own read lock.
            self.dbc.set(key)
            self.dbc.delete()
        else:
            del self.db[key]
...but this doesn't in fact remove the deadlock on the unit-test for popitem, which just confirms I don't really grasp what's going on, yet!-)
Hmm, I would've expected your __delitem__ to work. Regardless, using the debugger I can stop the deadlock from occurring if I do "self.dbc.close(); self.dbc = None" just before popitem's "del self[k]".

Greg
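As a sketch, that workaround amounts to something like the following (the surrounding popitem body is reconstructed from the description in this thread, not quoted from the actual source):

    def popitem(self):
        k, v = self.iteritems().next()
        # Workaround: close the internal cursor so its read lock is
        # released before the write below.
        if self.dbc is not None:
            self.dbc.close()
            self.dbc = None
        del self[k]    # no longer deadlocks against our own cursor
        return (k, v)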