
I'd like to produce improved bounce statistics from Mailman's logs; for example, I'd like to track, per message, how many recipients there were and how many of them bounced.
This means the logs need to include the message ID. Most of Mailman's logs (smtp, post, vette) include the ID, but the critical missing piece is the bounce log. To close the loop, I'll need to extract the ID of the message that was bounced. (The message ID of the bounce message itself isn't useful, and may not even exist; qmail's bounces apparently don't have IDs.)
How should I write the code to extract the ID? Looking through the bounce test messages, there are various formats so we'll need several functions, similar to how there are several bounced-address parsers in Mailman.Bouncers. Should I:
1) add an extract_message_id() function to all of the modules in Mailman.Bouncers, which currently just have a process() function?
2) have a new package, Mailman.Bouncers.MessageId or whatever, that has several modules, analogous to the existing bounce analysis?
3) have a bunch of analysis functions in one module, and a single master function that tries all of them?
I think 3) is the simplest course, but not too simple to be workable. Looking at the test bounces, finding the message ID is much simpler than finding the bounce address; searching for a small number of strings such as 'Original message follows' will often find the original headers. Can anyone see a reason that the more complicated 1) or 2) would be necessary?
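
To make the single-module idea concrete, here is a rough sketch of what I have in mind. Nothing in it is existing Mailman code: the helper names, the marker list (only 'Original message follows' comes from the test bounces mentioned above), and the MIME handling are all assumptions for illustration, and it is written against the Python 3 standard-library email package rather than whatever 2.2 ends up using.

```python
import email
import re

# Hypothetical marker strings; only the first comes from the post above.
# A real list would be built up from the bounce test messages.
_MARKERS = ('Original message follows',)

# Matches a Message-ID header at the start of a line.
_MSGID_RE = re.compile(r'^Message-ID:\s*(<[^>\s]+>)',
                       re.IGNORECASE | re.MULTILINE)


def _from_mime_part(text):
    """Look for the original headers in a message/rfc822 or
    text/rfc822-headers part of a MIME bounce."""
    msg = email.message_from_string(text)
    for part in msg.walk():
        ctype = part.get_content_type()
        msgid = None
        if ctype == 'message/rfc822':
            payload = part.get_payload()
            # The payload of message/rfc822 is a one-element list.
            if isinstance(payload, list) and payload:
                msgid = payload[0].get('Message-ID')
        elif ctype == 'text/rfc822-headers':
            headers = email.message_from_string(part.get_payload())
            msgid = headers.get('Message-ID')
        if msgid:
            return msgid.strip()
    return None


def _from_marker(text):
    """Look for a Message-ID header after a known marker string."""
    for marker in _MARKERS:
        pos = text.find(marker)
        if pos >= 0:
            m = _MSGID_RE.search(text, pos)
            if m:
                return m.group(1)
    return None


# Tried in order; the first non-empty answer wins.
_EXTRACTORS = (_from_mime_part, _from_marker)


def extract_message_id(text):
    """Return the Message-ID of the bounced message, or None."""
    for func in _EXTRACTORS:
        msgid = func(text)
        if msgid:
            return msgid
    return None
```

Supporting a new bounce format would then mean adding one small function and appending it to _EXTRACTORS, instead of touching every module in Mailman.Bouncers.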
(I'll start a branch for this too, aimed at getting the change into 2.2.)
--amk