Re: [Mailman-Developers] Alternative MemberAdaptors

4 Mar 2003

On Thu, Feb 20, 2003 at 10:55:52AM -0500, barry@python.org wrote:
[ MemberAdaptors ]
...
I've found a few things during this experience that point to things we
ought to improve.  I don't have a lot of time right now, but I wanted
to put this out there to start the discussion.  I'll quickly mention a
few things.

I haven't looked at MemberAdaptors in any level of detail yet, but I do
intend to write one or two member adaptors for our internal company lists.
One is a straight (My)SQL one that takes data from a simple set of tables,
one of which is specifically for Mailman (and thus easy to change.)
The other would be a much more complex adaptor that hooks right into our
company database ('NSA', PostgreSQL with an OO interface library written in
Perl[*]) using XML-RPC. I already use XML-RPC from Python to test and twiddle
with so much more ease than from Perl or PHP, so I'm sure that's not going
to be an issue. The main reason for using the XML-RPC interface is, however,
to be able to access all the email aliases the list subscribers have. The
company-internal lists are strictly controlled, and every day or so someone
will post a message from their new funky alias, which will be held. I
thought Mailman 2.1 was going to have a mechanism to avoid that (a listing
of 'these email addresses are also me' in the options page) but that may have
been a dream. In any case, I can add that :)
I'm also not sure whether I really want a full MemberAdaptor for the XML-RPC
case, or a mix of another backend and XML-RPC. Anyway, neither
implementation would be transactional (and the MySQL server is 3.x.)
...

I wanted to hook into the BDB transaction (txn) machinery, and I
found a convenient hook.  I overloaded MailList.Lock() to include a
txn begin, MailList.Save() to do a txn commit, and MailList.Unlock()
to do a txn abort.  This seems to work well as long as aborting
after committing is harmless (it is in BDB).  I'd like to get
feedback from the SQL folks (or other MemberAdaptor developers) on
whether we need more explicit transaction support or whether the
basically necessary hooks are already there.

Well, I can't really tell without (re-)grokking the code more, but in any
case an abort after a commit should not pose a problem; it's just a matter
of remembering state. Many SQL implementations won't even care if you do
'BEGIN WORK; <work>; COMMIT; ROLLBACK;' -- they'll give a notice, but not
abort anything.
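
To make the 'remembering state' bit concrete, here is a rough sketch. Only
the Lock/Save/Unlock names come from the discussion above; FakeBackend,
TxnHooks and the begin/commit/abort calls are made up for illustration and
are not Mailman or BDB API.

```python
class FakeBackend:
    """Hypothetical stand-in for a transactional store; records calls."""

    def __init__(self):
        self.calls = []

    def begin(self):
        self.calls.append("begin")

    def commit(self):
        self.calls.append("commit")

    def abort(self):
        self.calls.append("abort")


class TxnHooks:
    """Hypothetical wrapper showing Lock/Save/Unlock as txn hooks."""

    def __init__(self, backend):
        self._backend = backend
        self._in_txn = False  # the state we have to remember

    def Lock(self):
        if not self._in_txn:
            self._backend.begin()
            self._in_txn = True

    def Save(self):
        if self._in_txn:
            self._backend.commit()
            self._in_txn = False

    def Unlock(self):
        # Abort only if a transaction is still open, so an Unlock()
        # after a Save() never reaches the backend at all.
        if self._in_txn:
            self._backend.abort()
            self._in_txn = False


backend = FakeBackend()
mlist = TxnHooks(backend)

mlist.Lock()
mlist.Save()
mlist.Unlock()
saved_calls = list(backend.calls)

backend.calls.clear()
mlist.Lock()
mlist.Unlock()
unsaved_calls = list(backend.calls)
```

With the flag in place the backend never sees an abort after a commit, so
it doesn't matter whether the backend would tolerate one.
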
When not doing anything transactional, it gets easier, of course. Maintain
all state in the Adaptor, and only commit something to a backend on the
Save() :)
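
Something along these lines, as a minimal sketch. The method names echo the
MemberAdaptor interface but the signatures are simplified, and the 'backend'
is just a Python set, so treat all of it as an assumption rather than
working adaptor code.

```python
class BufferedMembers:
    """Hypothetical adaptor that keeps changes in memory until save()."""

    def __init__(self, store):
        self._store = store    # the backend; a plain set in this sketch
        self._added = set()
        self._removed = set()

    def addNewMember(self, address):
        address = address.lower()
        self._removed.discard(address)
        self._added.add(address)

    def removeMember(self, address):
        address = address.lower()
        self._added.discard(address)
        self._removed.add(address)

    def getMembers(self):
        # Backend contents plus pending adds, minus pending removals.
        return sorted((self._store | self._added) - self._removed)

    def save(self):
        # The only place the backend is written to.
        self._store |= self._added
        self._store -= self._removed
        self._added.clear()
        self._removed.clear()


store = {"a@example.com"}
members = BufferedMembers(store)
members.addNewMember("B@example.com")
members.removeMember("a@example.com")

store_before_save = set(store)
view_before_save = members.getMembers()
members.save()
store_after_save = set(store)
```

Dropping the adaptor without calling save() leaves the backend untouched,
which is about as close to an abort as a non-transactional backend gets.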
...

We really need to optimize the MemberAdaptor API and the
implementations that use them.  Especially methods that return
lists, e.g. getMembers() and friends.  Right now, everything has to
return a list, but I could do much better by returning iterators,
because I can load my iterator up with a BDB cursor.  This has the
advantage of not requiring the entire member database to be loaded
into memory just to iterate over it.  Unfortunately, too much of the
rest of the code assumes these methods return lists, and while I
started to go down the iterator path, I backed out of it because of
the complexity.

This is highly backend dependent... In the XML-RPC case, you really don't
want an XML-RPC call to go out for every list member (especially not if the
xmlrpc library doesn't support/use keepalive.) On the other hand, getting
the entire list into the adaptor and then returning an iterator to that list
to Mailman might be suboptimal if Mailman ever has to decide between playing
convenient (a list) and playing nice (an iterator).
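
One middle road would be an iterator that fetches in batches, so there is
neither one call per member nor one giant list. A rough sketch, where
fetch_page is a hypothetical callable standing in for whatever the backend
offers (an XML-RPC method, a SQL query with a limit, a BDB cursor):

```python
def iter_members(fetch_page, page_size=100):
    """Yield members one by one while fetching them a page at a time.

    fetch_page(offset, limit) is a hypothetical backend call that
    returns a list of at most `limit` members starting at `offset`.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return
        yield from page
        if len(page) < page_size:
            # Short page: nothing left to fetch.
            return
        offset += len(page)


# Fake backend for illustration only.
ALL_MEMBERS = ["user%03d@example.com" % i for i in range(250)]
requests = []


def fake_fetch(offset, limit):
    requests.append((offset, limit))
    return ALL_MEMBERS[offset:offset + limit]


members = list(iter_members(fake_fetch, page_size=100))
```

Callers that really need a list can still wrap the result in list(), at
the cost of loading everything after all.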
...
There are other optimizations that would require a bit more
thought.  E.g. the admin's Membership List page seems to require
that the entire member database be iterated over to chunkify and
bucketize.  Fixing this probably requires both changes to the u/i
and changes to the interface.  It also makes life more difficult for
OldStyleMemberships, although BDBMemberAdaptor can probably be
fairly easily elaborated.

And in SQL, the optimal way to do this is probably to count the number of
entries, and then split it into chunks, get whichever chunk is desired, and
the first and last entry of the other chunks. The count can be done entirely
in the SQL server, and is generally pretty damned fast. This will definitely
pay off for very large lists, but it's not really trivial to expose all that
logic to the adaptor in a future-proof way.
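
Roughly what I have in mind, sketched with Python's sqlite3 module purely
so it runs standalone. The 'members' table, its 'address' column and the
chunk_overview function are assumptions for illustration; a MySQL or
PostgreSQL version would use the same COUNT and LIMIT/OFFSET idea.

```python
import sqlite3


def chunk_overview(conn, chunk_size, wanted):
    """Return (total, buckets): the wanted chunk in full, and only the
    (first, last) address of every other chunk."""
    (total,) = conn.execute("SELECT COUNT(*) FROM members").fetchone()
    nchunks = (total + chunk_size - 1) // chunk_size
    one_row = "SELECT address FROM members ORDER BY address LIMIT 1 OFFSET ?"
    buckets = []
    for i in range(nchunks):
        start = i * chunk_size
        size = min(chunk_size, total - start)
        if i == wanted:
            rows = conn.execute(
                "SELECT address FROM members ORDER BY address "
                "LIMIT ? OFFSET ?", (chunk_size, start)).fetchall()
            buckets.append([row[0] for row in rows])
        else:
            (first,) = conn.execute(one_row, (start,)).fetchone()
            (last,) = conn.execute(one_row, (start + size - 1,)).fetchone()
            buckets.append((first, last))
    return total, buckets


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (address TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO members VALUES (?)",
    [("user%02d@example.com" % i,) for i in range(25)])

total, buckets = chunk_overview(conn, chunk_size=10, wanted=1)
```

The hard part remains the interface: the sketch bakes the ordering and the
chunking into the adaptor, which is exactly the logic that is awkward to
expose in a future-proof way.
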

Maybe-we-should-do-a-Mailman-Sprint-at-PyCon-Barry-ly y'rs :)

[*] 'NSA' is also the reason I'm not as active as I once was... I hate Perl,
much more so now than before I actually used it full-time. But, I get to
go to PyCon, so I haven't lost my soul completely yet :)

Thomas Wouters <thomas@xs4all.net>
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
