Mailman 3 Alternative MemberAdaptors - Mailman-Developers

Feb. 20, 2003

This message is primarily for those of you writing or using alternative
MemberAdaptor implementations (e.g. SQL).
Yesterday I checked in a working implementation of a member adaptor
based on BerkeleyDB.  It seems to greatly improve the memory footprint
for really huge lists, at the cost of greater administrative overhead
(because you have to know how to set up and manage BerkeleyDBs), and
potentially slower performance (I haven't benchmarked it; this is just
based on my experience with BerkeleyDB in general).
Along the way I've found a few things that we ought to improve.  I
don't have a lot of time right now, but I wanted to put this out there
to start the discussion.  I'll quickly mention a few of them.

I wanted to hook into the BDB transaction (txn) machinery, and I
found a convenient hook.  I overloaded MailList.Lock() to include a
txn begin, MailList.Save() to do a txn commit, and MailList.Unlock()
to do a txn abort.  This seems to work well as long as aborting
after committing is harmless (it is in BDB).  I'd like to get
feedback from the SQL folks (or other MemberAdaptor developers) on
whether we need more explicit transaction support or whether the
necessary hooks are basically already there.
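
To make the hook placement above concrete, here is a minimal sketch of
the pattern.  FakeTxn and TxnMailList are hypothetical stand-ins, not
the real BDBMemberAdaptor or MailList code, and the sketch simply
assumes that aborting after a commit is a no-op, as described above.

```python
class FakeTxn:
    """Hypothetical stand-in for a BDB transaction handle."""

    def __init__(self, log):
        self._log = log
        self._active = True

    def commit(self):
        if self._active:
            self._log.append("commit")
            self._active = False

    def abort(self):
        # Assumption: aborting a finished txn is harmless, so do nothing.
        if self._active:
            self._log.append("abort")
            self._active = False


class TxnMailList:
    """Hypothetical list class showing where the txn calls hook in."""

    def __init__(self):
        self.log = []
        self._txn = None

    def Lock(self):
        # Locking the list also begins a transaction.
        self.log.append("begin")
        self._txn = FakeTxn(self.log)

    def Save(self):
        # Saving commits whatever was changed under the lock.
        if self._txn is not None:
            self._txn.commit()

    def Unlock(self):
        # Unlocking aborts; after a Save() this has no effect.
        if self._txn is not None:
            self._txn.abort()
            self._txn = None


def run_saved():
    mlist = TxnMailList()
    mlist.Lock()
    try:
        mlist.Save()
    finally:
        mlist.Unlock()
    return mlist.log


def run_unsaved():
    mlist = TxnMailList()
    mlist.Lock()
    mlist.Unlock()
    return mlist.log
```

With this shape the usual Lock / Save / Unlock-in-finally idiom commits
on success and rolls back when Save() is never reached.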

To make this work, however, I found I had to change the order of
when the extend.py hook gets run.  Specifically, I needed it to run
/before/ the list is locked in MailList.__init__(), otherwise
locking constructors don't hook into the machinery.  I want to commit
this change but I don't want to break other MemberAdaptors or
extend.py hooks.
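
A small sketch of why the ordering matters.  SketchList and the
hook_first flag are made up for illustration; this is not the real
MailList.__init__(), only the shape of the problem: a hook that wraps
Lock() has to be installed before the constructor takes the lock.

```python
class SketchList:
    """Hypothetical list whose constructor locks itself."""

    def __init__(self, extend=None, hook_first=True):
        self.events = []
        if extend is not None and hook_first:
            extend(self)      # proposed order: hook runs before locking
        self.Lock()
        if extend is not None and not hook_first:
            extend(self)      # old order: too late for this first Lock()

    def Lock(self):
        self.events.append("lock")


def extend(mlist):
    """Hypothetical extend hook that wraps Lock() with a txn begin."""
    plain_lock = mlist.Lock

    def locking_with_txn():
        mlist.events.append("txn begin")
        plain_lock()

    mlist.Lock = locking_with_txn
```

With hook_first=True the constructor's own Lock() call goes through the
wrapper; with hook_first=False the first lock is taken without a
transaction.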

We really need to optimize the MemberAdaptor API and the
implementations that use it, especially methods that return
lists, e.g. getMembers() and friends.  Right now, everything has to
return a list, but I could do much better by returning iterators,
because I can load my iterator up with a BDB cursor.  This has the
advantage of not requiring the entire member database to be loaded
into memory just to iterate over it.  Unfortunately, too much of the
rest of the code assumes these methods return lists, and while I
started to go down the iterator path, I backed out of it because of
the complexity.
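
As a rough illustration of the iterator idea, the sketch below wraps a
cursor in a generator.  FakeCursor is a hypothetical stand-in with
first()/next()/close() methods that return a (key, value) pair or None;
it is not the real BerkeleyDB cursor API, and iter_members() is not an
existing MemberAdaptor method.

```python
class FakeCursor:
    """Hypothetical cursor over (address, value) records, in key order."""

    def __init__(self, records):
        self._records = sorted(records.items())
        self._pos = -1
        self.closed = False

    def _current(self):
        if 0 <= self._pos < len(self._records):
            return self._records[self._pos]
        return None

    def first(self):
        self._pos = 0
        return self._current()

    def next(self):
        self._pos += 1
        return self._current()

    def close(self):
        self.closed = True


def iter_members(cursor):
    """Yield member addresses one at a time instead of building a list."""
    try:
        record = cursor.first()
        while record is not None:
            address, _value = record
            yield address
            record = cursor.next()
    finally:
        # Release the cursor even if the caller stops iterating early.
        cursor.close()
```

Callers that only loop over the result work unchanged, but anything that
calls len(), indexes, or sorts the result in place would break, which is
the kind of list assumption mentioned above.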
There are other optimizations that would require a bit more
thought.  E.g. the admin's Membership List page seems to require
that the entire member database be iterated over to chunkify and
bucketize.  Fixing this probably requires both changes to the u/i
and changes to the interface.  It also makes life more difficult for
OldStyleMemberships, although BDBMemberAdaptor can probably be
fairly easily elaborated.
I'd like to hear from other member adaptor implementations on their
thoughts here.
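
One possible direction, sketched under the assumption that members can
be produced by an iterator: pull out just one page with
itertools.islice.  member_page() is a hypothetical helper, not part of
the current interface, and it does not address bucketizing by letter,
which would still need some index on the adaptor side.

```python
from itertools import islice


def member_page(members, page, per_page):
    """Return one page of members from any iterable.

    Entries before the page are still stepped over, but only the
    requested page is ever held in memory.
    """
    start = page * per_page
    return list(islice(members, start, start + per_page))
```

For example, member_page(iter("abcde"), 1, 2) skips the first two
entries and returns the next two.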

I'd love for any BerkeleyDB experts to review the BDBMemberAdaptor
code, especially in some of the choices I've made for creating and
opening the environment.  I had a lot of practical problems with
this part of the code, especially in getting multiple processes to
cooperate reasonably.  Any BerkeleyDB experts out there?  (I'm
fairly happy with the schemas, at least for the current
MemberAdaptor API).

I'm leaning heavily toward having this stuff in Mailman 2.2 and
/not/ porting it to 2.1.x.  Too many changes for a micro release,
although it makes project management more complicated, especially in
merging fixes back into the 2.1.x maintenance branch.  Sigh.

Okay, I'm out of time for today.  Any feedback will be appreciated,
even if I can't respond immediately.
Also, the BerkeleyDB based member adaptor seems to work, but should be
considered experimental.  See the BDBMemberAdaptor.py comments for how
to hook this up to a mailing list.  There is currently no migration
tool from classic member adaptors to BDBMemberAdaptors, although I
intend to write such a beast and run a few of my personal lists on the
code to flesh things out.
Enjoy,
-Barry

Alternative MemberAdaptors

barry＠python.org

Thomas Wouters

[*] 'NSA' is also the reason I'm not as active as I once was... I hate Perl, much more so now than before I actually used it full-time. But, I get to go to PyCon, so I haven't lost my soul completely yet :)

M.-A. Lemburg
