[Mailman-Users] failing qrunner
jaco at kroon.co.za
Sat Sep 15 22:50:45 CEST 2007
Mark Sapiro wrote:
> Jaco Kroon wrote:
>> What I want to know is how mailman handles the message delivery runs.
>> Afaik each message that needs to go out is stored in some location,
>> along with a list of recipients, so periodically mailman checks which
>> messages need to go out, and to which recipients, and it then tries to
>> make those deliveries, removing the recipients that it successfully
> That is correct.
> Assuming this is at least Mailman 2.1.x, the messages to be sent are
> placed in Mailman's 'out' queue (normally Mailman's qfiles/out/
> directory) and picked up and delivered by OutgoingRunner. If the MTA
> returns a non-retryable failure for one or more recipients, that is
> logged in Mailman's smtp-failure log and treated as a bounce for the
> failed recipients.
I gathered that the queue mechanism is from 2.1.x as I did locate the
out/ directory and the .pck files in there for each outbound message.
> If the MTA returns a retryable failure for one or more recipients, that
> is also logged in Mailman's smtp-failure log and the message is queued
> in the 'retry' queue for delivery to the failed recipients. Every 15
> minutes, RetryRunner moves the message from the retry queue back to
> the out queue.
OK. That covers the 4xx and 5xx responses to rcpt to:. But what happens if
the MTA simply closes the connection? From what I gathered, the SMTP
conversation must have looked something like this:
S: 220 servername ESMTP Exim ....
C: helo servername
S: 250 servername Hello localhost [127.0.0.1]
C: mail from: <???-bounces at hostname>
S: 250 OK
C: rcpt to: <legal at addr>
S: 250 OK
C: rcpt to: <illegal at addr?>
S: --- force close connection ---
Now, the problem here is that you don't really know whether it's a 5xx
or a 4xx error code, and it actually looks like the entire run for that
message gets interrupted and put to sleep in its entirety. This may
have been a bug that got fixed at some point (I don't even know which
exact version of mailman I'm working with, but it was released around
Feb 2007 at the latest).
So at this point it simply wouldn't continue any further, and
smtp-failures actually logs the address after the faulty one as the one
causing a problem.
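
To illustrate why a dropped connection is ambiguous, here is a minimal
sketch using Python's standard smtplib. It is not Mailman's delivery code:
the function name classify_recipients, and the policy of treating the
interrupted recipient and everything after it as retryable, are assumptions
made purely for illustration. conn is expected to be a connected
smtplib.SMTP object on which MAIL FROM has already been accepted.

```python
import smtplib


def classify_recipients(conn, recips):
    """Issue RCPT TO for each address and sort the results into buckets.

    Hypothetical helper, not part of Mailman.
    Returns (accepted, temporary, permanent) lists of addresses.
    """
    accepted, temporary, permanent = [], [], []
    for i, addr in enumerate(recips):
        try:
            code, _ = conn.rcpt(addr)
        except smtplib.SMTPServerDisconnected:
            # The server hung up without sending a reply code, so there is
            # no way to tell a 4xx from a 5xx. Assumption: keep this address
            # and all untried ones for a later retry.
            temporary.extend(recips[i:])
            break
        if 200 <= code < 300:
            accepted.append(addr)
        elif 400 <= code < 500:
            temporary.append(addr)   # retryable failure
        else:
            permanent.append(addr)   # anything else is treated as permanent
    return accepted, temporary, permanent
```

With this policy the address that triggered the disconnect keeps coming
back on every retry, which matches the stall described above.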
> This continues for DELIVERY_RETRY_PERIOD (default 5 days) after which,
> Mailman gives up on this message.
>> Is there a manual way to remove the problem-causing email
>> addy from this list for the particular message? We've already removed
>> it from the main list so it won't cause issues in future but it's now
>> holding up the delivery of an already sent message.
> First find the entry (a long, mostly numeric, name ending in .pck) in
> qfiles/retry, and move that file aside. Then use Mailman's bin/dumpdb
> to dump the file. This will output the raw message and the message
> metadata. The metadata contains a list of 'recips' which is the
> addresses remaining to be delivered.
I saw the dumpdb program, had no idea what it does though. Now I do,
and it'll make my life a lot easier next time. Any way to repack the file?
> If you are proficient in Python, you could write a short script to
> unpickle the message and metadata from the file, remove the bad
> recipient from recips and repickle the message and metadata. Then you
> could put the file in qfiles/out for delivery. (I'm currently
> debugging one I just wrote - I'll post a link soon).
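
A minimal sketch of what such a script might look like. It is not the
script mentioned above. It assumes the .pck file holds two consecutive
pickles, the message first and then a metadata dictionary with a 'recips'
list; the function name drop_recipient and its arguments are hypothetical.
On a real Mailman 2.1 installation it would need to run under Mailman's own
Python with the Mailman modules importable, in case the message was pickled
as a Mailman message object.

```python
import pickle


def drop_recipient(src_path, dst_path, bad_addr):
    """Copy a queue entry to dst_path, leaving bad_addr out of 'recips'.

    Hypothetical helper. Assumes the file layout described above:
    one pickle for the message, followed by one for the metadata dict.
    Returns the list of recipients that remain.
    """
    with open(src_path, "rb") as fp:
        msg = pickle.load(fp)    # first pickle: the message
        data = pickle.load(fp)   # second pickle: the metadata dict

    bad = bad_addr.lower()
    data["recips"] = [r for r in data.get("recips", []) if r.lower() != bad]

    # Write both objects back in the same order they were read.
    with open(dst_path, "wb") as fp:
        pickle.dump(msg, fp, 2)
        pickle.dump(data, fp, 2)
    return data["recips"]
```

Writing the result somewhere outside the queue directories first, and only
then moving the finished file into qfiles/out, avoids a runner picking up a
half-written entry.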
... or issue mailmanctl stop, open the file in vim, find the invalid
address and, without changing the size of the file, change it to an
RFC-legal address that is bogus. For example, jaco at kroon.co.za? can be
changed to jaco at kroon.co.zaa, which keeps the pickle intact and stops
exim from closing the connection. Instead the message will bounce back to
mailman, harmlessly, since this server isn't using VERP.
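
The same in-place edit can be sketched in Python instead of vim. The
function name patch_same_length is hypothetical, and the sketch assumes
the qrunners are stopped and that the address is stored in the file as
plain bytes. The length check matters because binary pickle formats record
each string's length, so a replacement of a different size would corrupt
the file.

```python
def patch_same_length(path, old, new):
    """Replace the bytes `old` with `new` in a file without changing its size.

    Hypothetical helper. Both arguments are bytes of equal length.
    Returns the number of occurrences that were replaced.
    """
    if len(old) != len(new):
        raise ValueError("replacement must have the same length as the original")

    with open(path, "rb") as fp:
        blob = fp.read()

    count = blob.count(old)
    if count == 0:
        raise ValueError("address not found in file")

    # Same-length substitution, so every length prefix in the pickle stays valid.
    with open(path, "wb") as fp:
        fp.write(blob.replace(old, new))
    return count
```

Working on a copy of the queue file first is cheap insurance.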
> Alternatively, you could just remail the message outside of mailman to
> the remaining recipients.
That could have been simpler.