[Mailman-Developers] Re: Components and pluggablility
J C Lawrence
claw@kanga.nu
Thu, 14 Dec 2000 19:07:34 -0800
On Thu, 14 Dec 2000 19:05:25 -0500
Barry A Warsaw <barry@digicool.com> wrote:
> I like the idea of process queues, but I don't want to take the
> federation-of-processes architecture too far. Yes, we want a
> component architecture, but where I see the process boundaries is
> at the message queue level.
There are in essence seven queues:
  1) inbound         Message arrives at the MLM
  2) authentication  Do I accept it?
  3) moderation      Does the list accept it?
  4) pending         Associate a distribution list with message
  5) outbound        Send it.
  6) bounce          Demote the subscriber
  7) command         A combo of #2, #3, and command processing.
There's a possible eighth for OOB stuff like archiving and digests,
which I mostly see as a fork off the side of the pending queue.
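To make the list above concrete, here is a minimal sketch of the seven queues and the path an ordinary posting would take through them. The linear routing table is my assumption from the order of the list; the names Queue, NEXT_QUEUE and posting_path are hypothetical and not anything in Mailman.

```python
from enum import Enum


class Queue(Enum):
    INBOUND = "inbound"
    AUTHENTICATION = "authentication"
    MODERATION = "moderation"
    PENDING = "pending"
    OUTBOUND = "outbound"
    BOUNCE = "bounce"
    COMMAND = "command"


# Assumed happy path for a list posting. Bounce and command traffic
# would enter at their own queues and are left out of this table.
NEXT_QUEUE = {
    Queue.INBOUND: Queue.AUTHENTICATION,
    Queue.AUTHENTICATION: Queue.MODERATION,
    Queue.MODERATION: Queue.PENDING,
    Queue.PENDING: Queue.OUTBOUND,
}


def posting_path(start=Queue.INBOUND):
    """Follow the routing table until a queue has no successor."""
    path = [start]
    while path[-1] in NEXT_QUEUE:
        path.append(NEXT_QUEUE[path[-1]])
    return path
```

An eighth queue for archiving and digests would hang off PENDING as a side branch rather than sit in this chain.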
> So in my view, when Mailman decides that a message can be
> delivered to a membership list, it's dropped fully formed in an
> outbound queue.
Not exactly. It drops a message, any relevant metadata, and a
distribution list in the outbound queue. A delivery process then
takes those and does what it will with them (e.g. VERP, application
of templates, etc.).
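A rough sketch of what such an outbound entry and a VERP-applying delivery step might look like. JSON stands in here for some language-neutral file format; the field names, the bounce_address key and the "+user=domain" VERP encoding are assumptions for illustration, not a format anyone has agreed on.

```python
import json


def make_outbound_entry(message_text, metadata, recipients):
    """Serialize message, metadata and distribution list as one entry."""
    return json.dumps({
        "message": message_text,
        "metadata": metadata,
        "recipients": sorted(recipients),
    })


def verp_sender(bounce_address, recipient):
    """Encode the recipient into the envelope sender (one common VERP style)."""
    local, domain = bounce_address.split("@", 1)
    return f"{local}+{recipient.replace('@', '=')}@{domain}"


def expand_deliveries(entry_json):
    """The delivery process: turn one entry into per-recipient deliveries."""
    entry = json.loads(entry_json)
    bounce = entry["metadata"]["bounce_address"]
    return [
        (verp_sender(bounce, rcpt), rcpt, entry["message"])
        for rcpt in entry["recipients"]
    ]
```

The point is that the entry carries the distribution list unexpanded, so the per-recipient work (VERP, templates) stays in the delivery process and out of the list manager.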
Process pipes...
> The file formats are the interface b/w Mailman and the queue
> runners and should be platform (i.e. Python) independent.
Bingo. This is a point I've invested considerable time into.
> That way, I can ship a simple queue runner that takes messages
> from the outbound queue and hands them off to the smtpd, but /you/
> could drop in a different runner process that uses GNQS to
> distribute load across an infinitely expandable smtpd server farm.
If you continue the same abstraction across all queues and the
staging processes of queues, you build something that isn't
inherently a queue-run system; it merely looks like one and can in
fact be fairly trivially hung off a queue-based system (MQM or
whatever).
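As a sketch of that abstraction: every stage reads from one queue, does its work, and writes to another, and it only ever sees put() and get(). The in-memory queue here could then be swapped for a client of an external queue manager. All class and function names are hypothetical, and the "held" queue is an invented sink for the example, not one of the seven.

```python
from collections import deque


class MemoryQueue:
    """In-process queue; an external queue manager could expose the same two methods."""

    def __init__(self):
        self._items = deque()

    def put(self, item):
        self._items.append(item)

    def get(self):
        # Return None when empty rather than blocking.
        return self._items.popleft() if self._items else None


class Stage:
    """Takes one item from a source queue, processes it, and routes the result."""

    def __init__(self, source, handler, sinks):
        self.source = source
        self.handler = handler  # item -> (sink name, item)
        self.sinks = sinks      # sink name -> queue

    def run_once(self):
        item = self.source.get()
        if item is None:
            return False
        sink_name, result = self.handler(item)
        self.sinks[sink_name].put(result)
        return True


def moderate(message):
    # Hypothetical rule: only approved messages move on to the pending queue.
    return ("pending" if message.get("approved") else "held", message)


moderation, pending, held = MemoryQueue(), MemoryQueue(), MemoryQueue()
stage = Stage(moderation, moderate, {"pending": pending, "held": held})
moderation.put({"id": 1, "approved": True})
moderation.put({"id": 2, "approved": False})
while stage.run_once():
    pass
```

Nothing in Stage knows whether its queues are local or remote, which is the property that lets the same code hang off a real queue system.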
Consider the following setup:
Three machines:
HostA is the primary MX and receives the list mail along with
all mail for the rest of the site.
HostB has a private hole in the firewall and is the only host to
have access to the backing stores for authentication and
membership data.
HostC has a nicely tuned MTA built for outbound processing.
Given a queue-based system, supporting that, or something several
dozen times more complex, becomes trivial. The problem is in making
the architecture that runs on a single host without an external
queue manager the same as the system above, where different hosts
each take responsibility for different queues in the message system.
It can be done; it just requires a little elegance.
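One way to picture it: the only thing that differs between the single-host install and the three-host one is a table saying which host owns which queue. A sketch follows; the particular placement of queues on HostB is my guess from which stages need the membership and authentication stores, and none of these names exist in Mailman.

```python
QUEUES = (
    "inbound", "authentication", "moderation",
    "pending", "outbound", "bounce", "command",
)

# Single host, no external queue manager: one host owns every queue.
SINGLE_HOST = {queue: "localhost" for queue in QUEUES}

# The three-machine setup. The placement is an assumption: anything that
# needs the authentication or membership stores goes on HostB.
THREE_HOSTS = {
    "inbound": "HostA",
    "authentication": "HostB",
    "moderation": "HostB",
    "pending": "HostB",
    "bounce": "HostB",
    "command": "HostB",
    "outbound": "HostC",
}


def queues_for(host, assignment):
    """Return the queues a given host is responsible for, in pipeline order."""
    missing = [queue for queue in QUEUES if queue not in assignment]
    if missing:
        raise ValueError(f"unassigned queues: {missing}")
    return [queue for queue in QUEUES if assignment[queue] == host]
```

Each host would start runners only for queues_for(its_name, table); the runner code itself would be identical in both deployments.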
> [Side note. Here's another reason why I'm keen on ZODB/ZEO as the
> underlying persistency mechanism for internal Mailman data: I
> believe we can parallelize the moderate-and-munge part of message
> processing. Because the ZEO protocols serialize writes at commit
> time, you could have multiple moderate-and-munge processes running
> on a server farm and guarantee db consistency across them.
There are problems with this due to the fact that external
transactions (such as SMTP sends) are asynchronous and not nested in
ZODB transactions.
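One reading of the problem, as a toy: if a transaction body performs an external action and the commit then fails on a write conflict, the retry repeats the action, because the store cannot roll back something that already left the building. This sketch uses no ZODB code at all; ToyStore, ConflictError and run_with_retry are stand-ins invented for the illustration.

```python
class ConflictError(Exception):
    """Raised when another writer committed since we started."""


class ToyStore:
    """Toy stand-in for a store that detects write conflicts at commit time."""

    def __init__(self):
        self.version = 0
        self.data = {}

    def commit(self, seen_version, changes):
        if seen_version != self.version:
            raise ConflictError
        self.data.update(changes)
        self.version += 1


def run_with_retry(store, body, interfere=None, attempts=3):
    """Run body and commit its changes, retrying on conflict."""
    for attempt in range(attempts):
        seen = store.version
        changes = body()
        if interfere is not None and attempt == 0:
            interfere()  # simulate another process committing first
        try:
            store.commit(seen, changes)
            return attempt + 1
        except ConflictError:
            continue
    raise ConflictError


sent = []  # stands in for messages handed to an SMTP server


def moderate_and_send():
    sent.append("msg-1")  # external side effect, cannot be rolled back
    return {"msg-1": "delivered"}
```

With one interfering commit, the body runs twice and "msg-1" goes out twice even though the store ends up consistent. Re-injecting into a queue as part of the committed data, and sending from there, avoids putting the send inside the transaction body.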
>>>>>> "JCL" == J C Lawrence <claw@kanga.nu> writes:
>>>>>> "CVR" == Chuq Von Rospach <chuqui@plaidworks.com> writes:
> These processes are not completely independent of Mailman though,
> e.g. for handling hard errors at smtp transaction time or URL
> generation for summary digests. Some of these can be handled by
> re-injection into the message queues (i.e. generate a bounce
> message and stick it in the bounce queue), but some may need an
> rpc interface.
Thus the pending queue above -- it allows a message to undergo a set
of pre-post filters prior to landing in the outbound
queue. Archiving, digests, all sorts of things can happen at that
point.
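A small sketch of that pending stage: a chain of pre-post filters, where archiving and digesting fork a copy off to the side and the message itself carries on to outbound. Messages are plain dicts here, and every function name is hypothetical.

```python
def archive_filter(archive):
    """Build a filter that forks a copy of each message to the archive."""
    def apply(message):
        archive.append(dict(message))
        return message
    return apply


def digest_filter(digest):
    """Build a filter that records each subject for a later summary digest."""
    def apply(message):
        digest.append(message["subject"])
        return message
    return apply


def add_footer(message):
    # Return a new dict so copies taken by earlier filters stay unchanged.
    return {**message, "body": message["body"] + "\n-- \nlist footer"}


def run_pending(message, filters, outbound):
    """Apply each filter in order; a filter returning None drops the message."""
    for apply in filters:
        message = apply(message)
        if message is None:
            return False
    outbound.append(message)
    return True
```

Filter order matters: with the archive filter ahead of add_footer, the archived copy is the message as posted, while the outbound copy carries the footer.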
--
J C Lawrence claw@kanga.nu
---------(*) http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--