Multithread ZODB.

Thu Feb 22 21:02:01 EST 2001

"Alexander Semenov" <sav at ulmen.mv.ru> writes:

> Can someone give me a python code snippet, or skeleton of multithread
> program which uses ZODB? Multithread examples are missing from ZODB
> guide. It says what I must create connection for each thread, but I don't
> know for what? Should I make root objects separately for each thread?
> I prefer to have common object for all threads.

We had a similar goal and had actually written a multi-threaded
application using ZODB before it really sank in that not only does
ZODB support separate connections for each thread, but its current
implementation makes them hard to avoid (the default get_transaction()
is implicitly thread-specific).  We also just wanted a single set of
persistent objects, as we were maintaining our own controlled access
to them amongst the threads - e.g., a typical shared state object
setup that we just wanted to overlay on a persistent object store.

In the end, it turned out that as long as we were committing changes
right as they were being made, and in the thread that made them,
everything worked with a single set of objects.  We did have, I think,
two instances where we appeared to be losing data, which were traced
back to not committing in the thread making the change - one of the
code paths had skipped the get_transaction().commit() call.  It was
actually the first of these cases that highlighted the
multi-connection dependency noted above.
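
To make the "commit in the thread that made the change" point
concrete, here is a minimal toy model.  It does not import ZODB;
ToyTransaction and this get_transaction() are hypothetical stand-ins
that only imitate the per-thread behaviour described above, so treat
it as a sketch of the failure mode rather than of ZODB's internals.

```python
import threading

_local = threading.local()  # one slot per thread, like a per-thread transaction


class ToyTransaction:
    """Hypothetical stand-in: remembers which objects were changed."""

    def __init__(self):
        self.changed = []
        self.committed = []

    def register(self, obj):
        self.changed.append(obj)

    def commit(self):
        # Only changes registered with *this* transaction are committed.
        self.committed.extend(self.changed)
        self.changed = []


def get_transaction():
    """Return the calling thread's transaction, creating it on first use."""
    txn = getattr(_local, "txn", None)
    if txn is None:
        txn = _local.txn = ToyTransaction()
    return txn


seen = {}


def worker():
    txn = get_transaction()
    txn.register("account")   # change made in the worker thread...
    seen["worker"] = txn      # ...but the worker never calls commit()


t = threading.Thread(target=worker)
t.start()
t.join()

get_transaction().commit()    # commit issued from the main thread instead
seen["main"] = get_transaction()
```

In this model the main thread's commit() has nothing to write, and
the worker's change is still sitting uncommitted in the worker's own
transaction, which is the shape of the data loss described above.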

At the time, I did get some e-mail feedback from Jim Fulton in this
regard (which I hope he doesn't mind my reproducing here - it's just
some general stuff) - in particular (">" is my note to which he was
responding):

  "> ZODB3 is documented as supporting multiple connections to a storage
   > which supports independent object caches for multiple threads.
   > However, although I've looked around at a bunch of documents, I can't
   > seem to find if this is required or simply available. 

   It isn't required, but it is by far the common usage pattern.

   Hm, I guess that the current implementation actually makes it
   hard to institute a different policy.

   > That is, if I'm
   > using ZODB in a multi-threaded Python application, must I create
   > separate connections for each thread, or can they share a single
   > connection.

   Multiple threads *can* share a single connection, but
   if you do this, you'll need to:

     - Perform whatever locking is required to serialize
       access to the shared objects,

     - You'll need to override get_transaction with a version
       that doesn't manage transactions by thread. "

We've been running our application for several months now without
actually having taken the step of replacing get_transaction() yet.  It
works as long as we stick to committing right away, in the thread that
made the change, before another thread goes to change the object again :-)

In terms of why it needs to be replaced (presumably to remove that
commit behavior restriction), Jim said:

  "The logic for the Persistent mix-in class calls get_transaction
   to get the current transaction when it needs to
   register that an object has changed.  The standard get_transaction
   manages transaction objects per thread.  In addition, the
   function is cached the first time it is called, to avoid
   global lookups each time. 

   You can replace get_transaction by installing a different version
   in builtins, but you will need to do so immediately after importing
   ZODB, to make sure your version gets cached."

We've been intending to test reworking it with separate connections,
but haven't had a chance yet.  The biggest wrinkle I imagined in that
is that we share object references among threads, and I wasn't sure
how the references from the different connections would work.

In our application, we serialize our access to the set of persistent
objects amongst all threads (effectively using a global lock on the
root object that is obtained whenever a thread is going to be playing
with state data), and any work a thread does on those objects is
committed before the lock is released.  So there's no chance of two
threads colliding by trying to update/commit the same persistent
object at the same time.  We also used ZODB strictly for persistence (no
versioning or transactions).
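
The lock-then-commit discipline described above might look roughly
like the following.  This is only a sketch under stated assumptions:
the state dictionary stands in for the persistent root object, and
commit() is a hypothetical hook that records a snapshot where the real
application would call get_transaction().commit().

```python
import threading
from contextlib import contextmanager

state_lock = threading.Lock()  # the single global lock guarding all state
state = {"visits": 0}          # stand-in for the persistent root object
commit_log = []                # what a real commit would have written


def commit():
    """Hypothetical commit hook: snapshot the state instead of storing it."""
    commit_log.append(dict(state))


@contextmanager
def locked_state():
    """Hold the global lock while working, and commit before releasing it."""
    with state_lock:
        yield state
        commit()  # still inside the lock; skipped if the block raised


def worker():
    for _ in range(100):
        with locked_state() as s:
            s["visits"] += 1


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the commit happens in the same thread, inside the lock, every
change is committed by the thread that made it before any other thread
can touch the object.  The cost is that all access to the shared state
is serialized.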

But within that structure, it's worked extremely well and has proven
very reliable and robust.

--
-- David
-- 
/-----------------------------------------------------------------------\
 \               David Bolen            \   E-mail: db3l at fitlinxx.com  /
  |             FitLinxx, Inc.            \  Phone: (203) 708-5192    |
 /  860 Canal Street, Stamford, CT  06902   \  Fax: (203) 316-5150     \
\-----------------------------------------------------------------------/