[Mailman-Developers] Adding headers to mailman generated mails

Thu Jan 22 18:56:24 EST 2004

On Thu, 2004-01-22 at 18:21, Brad Knowles wrote:

> 	This is similar to what Eric Allman (at that time, before 
> Sendmail Inc. existed), Bryan Costales (at the time, working for 
> InfoBeat/Mercury Mail), and I (working at AOL) were discussing back 
> in 1996, in the creation of a Mail-Merge Transport Protocol (MMTP) 
> server, based on a modified version of sendmail along with a standard 
> language for transmitting that content.  With MMTP servers on both 
> ends, it would not matter how many thousands or millions of 
> recipients you might have, only one copy of the message body would be 
> transmitted, and all the rest would be filled in on the remote end.
> 
> 	We ultimately gave up on this idea because we realized that it 
> would make the spam problem much, much worse.  The same things that 
> help regular MTAs transmit millions of customized messages per hour 
> to their paying customers would probably allow spammers to transmit 
> billions of messages per hour to everyone in the universe.

Very interesting.  Some thoughts: there would still be some benefit here
for Mailman if we simply limited access to the MMTP to the localhost
interface.  That's how Mailman hands stuff off to the MTA, and in our
Exim configuration, we do quite a bit of special casing for localhost
connections.  The advantage here is that Mailman could go back to
batching deliveries to its worker mail server, reducing both the
bandwidth between the two processes and the disk I/O on the worker MTA.

(Read 'localhost' as privileged connection, e.g. Mailman feeding a smurf
farm.)

I had been thinking of a language that specifies the data source as a
db connection and a SQL command.  I wouldn't want to do
that across the Internet!  OTOH, with a protocol like MMTP, I suppose
you'd have to send all the data for all the recipients in the same
transaction, and the bandwidth trade-off would depend on the size of the
recipient-centric data.
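
A rough way to see that trade-off, as a back-of-the-envelope sketch
only: the function names and the sizes below are hypothetical, and the
model ignores SMTP envelope and header overhead.

```python
def personalized_bytes(template_size, per_recipient_sizes):
    """Bytes sent when every recipient gets its own fully stitched copy."""
    return sum(template_size + n for n in per_recipient_sizes)


def merged_bytes(template_size, per_recipient_sizes):
    """Bytes sent when the template goes once, followed by each
    recipient's data in the same transaction."""
    return template_size + sum(per_recipient_sizes)


# Hypothetical numbers: a 4000-byte template, 1000 recipients,
# 120 bytes of recipient-centric data each.
template_size = 4000
per_recipient = [120] * 1000

full = personalized_bytes(template_size, per_recipient)
merged = merged_bytes(template_size, per_recipient)
print(full, merged)
```

The saving shrinks as the per-recipient data grows relative to the
template; once each recipient's data is about as large as the template
itself, the merged form saves at most about half.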

> 	Certainly, before any serious discussion of creating something 
> like an MMTP server, and trying to make that a standard which you 
> would expect programs like sendmail, postfix, and Exim to implement, 
> I believe that the spam issue needs to be addressed.  You need to be 
> able to prove how this cannot be abused to generate spam instead.

That's certainly tricky, but I think it's got to boil down to privilege
or authentication.  It would still make me nervous to accept such jobs
from sites other than the ones I control.

As an aside: the spam issue is already a huge nightmare for list
servers.  For example, every once in a while we get spamcop reports
targeting python.org.  Why is that?  Well, we filter all email destined
to our lists through various levels of spam defenses, but crap does slip
through.  And then /we/ get flagged as the originator of the spam. 
That's just one issue related to spam we have to deal with.

> >  If the MTA could do what Mailman does here -- not creating a disk image
> >  for each instance of the message, but stitching it together in memory as
> >  it's going out on the wire -- I think you'd greatly improve disk
> >  contention.
> 
> 	I'm not sure that the MTA could safely do that in memory.  At 
> least, it would be difficult to ensure that the MTA gets this done 
> right.  This would be akin to handling the entire message queue in 
> memory for all messages, something which can't really be done safely 
> except under very strict circumstances.

What I was thinking was something along the lines of storing the
template and the 'job description', a concise definition of how to get
the recipient-centric data.  The jobs would have to be small enough so
that they could be reliably dequeued, stitched and sent while still
making the guarantees an MTA has to make.  I'm just hand-waving here of
course, and the rest is left as a simple matter of engineering <wink>.
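
To make the hand-waving slightly more concrete, here is a minimal
sketch of what a stored 'job description' could look like.  Everything
in it is an assumption for illustration: the MergeJob class, the
stitch() function, the members table, and the use of SQLite and
string.Template are stand-ins, not anything Mailman or any MTA
actually implements.

```python
import sqlite3
from dataclasses import dataclass
from string import Template


@dataclass
class MergeJob:
    """Hypothetical job description: what would be queued on disk
    instead of one message file per recipient."""
    template: str  # message text with $placeholders
    query: str     # SQL returning one row per recipient


def stitch(job, conn):
    """Yield (address, message) pairs one at a time, so only a single
    stitched copy is held in memory at any moment."""
    conn.row_factory = sqlite3.Row
    tmpl = Template(job.template)
    for row in conn.execute(job.query):
        fields = dict(row)
        yield fields["address"], tmpl.substitute(fields)


# Stand-in for the shared data source, with made-up members.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (address TEXT, name TEXT, token TEXT)")
conn.executemany(
    "INSERT INTO members VALUES (?, ?, ?)",
    [("anne@example.com", "Anne", "abc123"),
     ("bart@example.com", "Bart", "def456")],
)

job = MergeJob(
    template="To: $address\n\nHi $name, your token is $token.\n",
    query="SELECT address, name, token FROM members ORDER BY address",
)

messages = list(stitch(job, conn))
for address, text in messages:
    print(address, len(text))
```

The sketch says nothing about the hard part: a real MTA would still
have to record which recipients of a job have been delivered, so that
a crash mid-job neither loses nor duplicates mail.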

> >  In a sense, that's what we've talked about before.  If there were a
> >  standard language that the mail server and list manager could agree on
> >  for both defining the template, and defining the per-recipient data
> >  source, we could have a more efficient mechanism, with perhaps a hope of
> >  mta agnosticism.
> 
> 	That would be nice.  However, I fear that we have much more basic 
> problems that are much more serious, and which need to be resolved 
> before we can expect to start worrying about such subjects as 
> increasing efficiency in the interfaces between MLMs and MTAs.

I should mention that I'm specifically interested in increasing the
efficiency between Mailman and its local worker MTAs.  These are all
systems under my control so I should be able to tune them, set up
privileges, common data source access, etc. to make things work as
smoothly as possible until the message hits the external outgoing
interface.  After that, we have to play nice and standard.

I'm not even touching the 3rd rail of putting the MTA /in/ Mailman any
more :).

-Barry