[Mailman-Users] optimizing mail delivery

Thu Nov 18 03:04:21 CET 1999

On Wed, 17 Nov 1999 20:37:58 -0500 (EST) 
Barry A Warsaw <bwarsaw at cnri.reston.va.us> wrote:

> So do you think the Mailman way is better or worse?  I'm curious
> because I'm trying to decide whether I should port Mailman 1.0's
> bulk mailer code to the new message pipeline.

MailMan's current code is definitely in the "good enough"
category.  

> In the current 1.2 code base (available via the anonCVS), I
> os.popen() sendmail[1] passing the entire recipient list on the
> command line, then I pipe the message text to stdin and let the
> MTA do the rest.  There's one complication; if the length of the
> recipients list is greater than a certain length (current 3000), I
> chop it up into multiple popens, but I don't do any sorting of the
> recipient list.

My view is that appropriate and efficient handling of mail for
delivery is the domain of the MTA, and should not be the domain of
MUAs (in which camp MailMan sorta falls).  So while domain sorting
is pleasant, it really only acts to further bolster technically
lagging mail servers which are in need of new life anyway.

> The advantage I see of this is that sendmail can do its thing
> asynchronously, without keeping the list object locked the entire
> time.  The disadvantage is that Mailman is only aware of delivery
> problems if the delivery bounces.

Which of course requires a local MTA, which the current design
doesn't.

What actually needs to happen (presuming my current understanding of
the source is correct) is that the abstraction of MailMan's internal
mail queue needs to be finished.  Then, a mail broadcast attempt
would only stuff messages (cheaply) into the MailMan queue
mechanism, and then return, unlocking the list objects.  The queue
object of course can be lock free (lockless DBs aren't difficult)
and you then merely need to run a couple of MailMan queue runners to
pipe it to the MTA.
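
As a rough sketch of the "stuff it cheaply and return" half (not
Mailman's actual code): one lock-free approach is a spool directory
where each entry is written to a temporary file and then renamed into
place.  The directory layout, the JSON entry format and the names
enqueue() and pending() are all assumptions made for illustration.

```python
import json
import os
import tempfile
import time
import uuid


def enqueue(queue_dir, sender, recipients, message_text):
    """Drop one message into the spool directory without taking a lock.

    The entry is written to a temporary file first and then renamed
    into place.  The rename is atomic on a single filesystem, so a
    queue runner never sees a half-written entry.
    """
    os.makedirs(queue_dir, exist_ok=True)
    entry = {
        "sender": sender,
        "recipients": list(recipients),
        "text": message_text,
    }
    fd, tmp_path = tempfile.mkstemp(dir=queue_dir, prefix="tmp-")
    with os.fdopen(fd, "w") as fp:
        json.dump(entry, fp)
    # Timestamp prefix keeps entries roughly in arrival order; the
    # random suffix avoids collisions between concurrent writers.
    final_name = "%020d-%s.msg" % (time.time_ns(), uuid.uuid4().hex)
    final_path = os.path.join(queue_dir, final_name)
    os.replace(tmp_path, final_path)
    return final_path


def pending(queue_dir):
    """Return queued entry paths in name order, skipping temp files."""
    names = sorted(n for n in os.listdir(queue_dir) if n.endswith(".msg"))
    return [os.path.join(queue_dir, n) for n in names]
```

The list object only needs to stay locked for as long as enqueue()
takes; everything after that is the queue runners' problem.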

Once the messages are stuffed, you can then fork a queue runner,
which, if there are already sufficient queue runners, immediately
dies.  If more queue runners are needed, however, it proceeds to grab
messages and stuff them at the MTA in the normal fashion over SMTP.

Note:  If you do this, adding VERP support becomes a doddle.
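
The reason is that a queue runner handing over one message per
recipient can give each one its own envelope sender.  A small sketch
of the usual VERP encoding follows; the "+" and "=" separators, the
example addresses and the function names are assumptions for
illustration, not an existing Mailman API.

```python
def verp_sender(bounce_address, recipient):
    """Encode the recipient into the envelope sender.

    Assumes the local part of bounce_address contains no "+".
    """
    local, domain = bounce_address.rsplit("@", 1)
    rcpt_local, rcpt_domain = recipient.rsplit("@", 1)
    return "%s+%s=%s@%s" % (local, rcpt_local, rcpt_domain, domain)


def verp_recipient(bounce_to):
    """Recover the original recipient from the address a bounce hit."""
    local, _domain = bounce_to.rsplit("@", 1)
    _prefix, encoded = local.split("+", 1)
    rcpt_local, rcpt_domain = encoded.rsplit("=", 1)
    return "%s@%s" % (rcpt_local, rcpt_domain)


sender = verp_sender("mylist-bounces@lists.example.org", "jane@example.com")
print(sender)
print(verp_recipient(sender))
```

A bounce sent to the encoded address identifies the failing
subscriber without having to parse the bounce body.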

> I'm leary though of stepping on too much of the MTA's toes -- a
> good MTA should just do the right thing.

This is perhaps the best point.  Spending time to further bolster
fading technologies when better services are freely available hardly 
seems worth it.  

> Another alternative, which would be less work and delegates all
> delivery to the MTA, is to just pump all the recips to the local
> smtpd via smtplib.py.  The advantage here is that again we're MTA
> independent, but the disadvantage is that Mailman's delivery is
> synchronous with the smtpd.  We'd have to be very sure to unlock
> the list object during this transaction (but watching out for race
> conditions, locking again if failure statuses are handled
> directly, etc.)  More code, more opportunity for bugs.

No no no no no no.  MailMan operates asynchronously of the MTA, and
because as a list server it generates unusual loads in itself, it
cannot be subject to the performance vagaries of the MTA.

Consider the case I used to commonly run into:

  Message arrives and is delivered to MailMan.

  MailMan explodes that message into a couple thousand more messages 
  (100K+ subscribers).

  MailMan attempts to hand off messages to MTA.

  MTA refuses connections as system load is too high.

  MailMan bitches.

  Meanwhile another message arrives at MailMan to be exploded.

> Comments?

Nope!

-- 
J C Lawrence                              Internet: claw at kanga.nu
----------(*)                            Internet: coder at kanga.nu
...Honorary Member of Clan McFud -- Teamer's Avenging Monolith...