[Mailman-Users] duplicate messages sent

Thu Jun 3 00:29:12 CEST 1999

[Brian Ryner]

> I've never had this happen before but it happened this morning.  A
> message was sent to the mailing list, and many people on the list got
> the message twice (but not everyone).  I'm suspecting something gone
> wrong in the dividing up of the mail into batches.  The system is
> running Slackware 3.2, Mailman 1.0rc1, Python 1.51. Has anyone ever seen
> anything like this?
> 
> I have a complete log of the mailing.... I'm not going to post it
> because of the number of email addresses involved, but if there's
> anything I should look for, please let me know.  I noticed this:
> 
> Jun  2 10:33:34 monm sendmail[15959]: KAA15959:
> from=<list-admin at host.com>, size=3445, class=-60, pri=3141445,
> nrcpts=101, msgid=<199906021510.KAA15936 at host.com>, proto=ESMTP,
> relay=bin at localhost [127.0.0.1]
> Jun  2 10:33:34 monm sendmail[15974]: KAA15974:
> from=<list-admin at host.com>, size=3445, class=-60, pri=3141445,
> nrcpts=101, msgid=<199906021510.KAA15936 at host.com>, proto=ESMTP,
> relay=mailman at localhost [127.0.0.1]
> Jun  2 10:36:13 host sendmail[15993]: KAA15959:
> to=<user1 at domain1>,<user2 at domain1>, ....
> ....
> Jun  2 10:41:24 host sendmail[15994]: KAA15974:
> to=<user1 at domain1>,<user2 at domain1>, ....
> 
> This is just an example, there were more smtp id's.  I noticed that the
> first line has relay=bin at localhost; the second has mailman at localhost...
> is that anything to worry about?

Yes, I think it is.

I guess that the bin at localhost line is the one triggered directly from
the /etc/aliases pipe command, and that the mailman at localhost one is
due to some cron-started queue runner.  The reasoning behind this
guesswork is that Mailman never does any setuid() call, so a change
from user "bin" to user "mailman" is ... not expected, at the very
least :)

Of course, these two delivery processes should _never_ interfere with
each other -- and from my reading of the source, I can't understand
why they would.  This is how it all goes:

The pipe command:
=================

  The actual delivery is done by the script scripts/contact_transport
  by doing OutgoingQueue.enqueueMessage() for each chunk of addresses,
  and then calling Utils.TrySMTPDelivery() on the enqueued files.

  enqueueMessage() creates a spool file in ~mailman/data/, with the
  setuid bit set to indicate that this is not yet a "deferred" spool
  file.

  TrySMTPDelivery() tries to contact the SMTP server.  On success, the
  message is delivered to the appropriate chunk of members, and the
  queue file is deleted.  On failure, the queue file's setuid bit is
  cleared (indicating to the next cron queue runner that the file is
  not in the middle of a "pipe command" delivery).
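
  As a rough illustration of the setuid-bit convention described above,
  here is a minimal sketch in present-day Python.  It is not Mailman's
  actual code: the function names and the `send` callable are
  hypothetical, and the real enqueueMessage()/TrySMTPDelivery() may
  well differ in detail.

```python
import os
import stat


def mark_in_progress(mode):
    """Return `mode` with the setuid bit set (pipe-command delivery pending)."""
    return mode | stat.S_ISUID


def mark_deferred(mode):
    """Return `mode` with the setuid bit cleared (left for the cron runner)."""
    return mode & ~stat.S_ISUID


def deliver_spool_file(path, send):
    """Attempt one delivery of the spool file at `path`.

    `send` is a hypothetical callable that raises an exception when the
    SMTP server cannot be reached.  Returns True if the message was
    delivered (spool file removed), False if it was deferred.
    """
    try:
        send(path)
    except Exception:
        # Failure: clear the setuid bit so the next cron run picks it up.
        perms = stat.S_IMODE(os.stat(path).st_mode)
        os.chmod(path, mark_deferred(perms))
        return False
    # Success: the chunk has been handed to the SMTP server.
    os.remove(path)
    return True
```

  The point of the sketch is only that the file's mode bits carry the
  "who owns this delivery attempt" state between the two processes.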

The cron queue runner:
======================

  cron/run_queue simply calls OutgoingQueue.processQueue().
  processQueue gets the cron queue runner global lock, and then
  iterates through all files in ~mailman/data/ looking for filenames
  matching the spoolfile naming convention (enforced by
  enqueueMessage()).

  Whenever a matching filename is found, it is stat()ed and further
  checking is done.  If the file
   1. is not deferred (i.e. has the setuid bit set)
  and
   2. has a ctime not older than two hours,
  then the file is skipped.

  If the file isn't skipped, Utils.TrySMTPDelivery() is called on it.
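
  The skip test above can be sketched as follows.  Again this is an
  illustration in present-day Python, not the code from processQueue():
  the names should_skip(), should_skip_file() and MAX_AGE_SECONDS are
  made up for the example, and only the two conditions themselves come
  from the description above.

```python
import os
import stat
import time

MAX_AGE_SECONDS = 2 * 60 * 60  # the "two hours" from the description


def should_skip(mode, ctime, now):
    """Return True if the cron queue runner should leave the file alone.

    A file is skipped only when both conditions hold:
      1. it is not deferred (its setuid bit is still set), and
      2. its ctime is not older than two hours.
    """
    not_deferred = bool(mode & stat.S_ISUID)
    recent = (now - ctime) <= MAX_AGE_SECONDS
    return not_deferred and recent


def should_skip_file(path, now=None):
    """Apply should_skip() to the stat() result of a real spool file."""
    if now is None:
        now = time.time()
    st = os.stat(path)
    return should_skip(st.st_mode, st.st_ctime, now)
```

  Note that both inputs to the test (the mode bits and the ctime) live
  in the filesystem, so anything that alters them from outside, or a
  clock that jumps, changes the outcome.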

So, unless you have other factors (e.g. some mechanism clearing setuid
bits on "unknown" files, or a not very stable system clock, or some
other factor out of Mailman's reach) messing up this intricate
communication, I can't see how the problem you describe could have
occurred... but if someone does, please let me know.
-- 
Harald