[Mailman-Developers] Enhancing the bounce log

Sat Mar 8 01:02:12 CET 2008

A.M. Kuchling wrote:
>
>How should I write the code to extract the ID?  Looking through the
>bounce test messages, there are various formats so we'll need several
>functions, similar to how there are several bounced-address parsers in
>Mailman.Bouncers.  Should I:

I don't think we need several functions.

I ran the following

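import os
import re
import email

# Optional '>' quote marker, optional whitespace, then a Message-ID line.
hre = re.compile(r'^>?\s*message-id:\s*(<.*>)', re.IGNORECASE)
for f in os.listdir('.'):
    if not f.endswith('.txt'):
        continue
    msg = email.message_from_file(open(f))
    messageid = None
    inheaders = True
    for line in msg.as_string().splitlines():
        # Skip the DSN's own headers (and its own Message-ID); the
        # first blank line marks the start of the body.
        if inheaders:
            if line == '':
                inheaders = False
            continue
        mo = hre.search(line)
        if mo:
            # The first hit in the body is taken as the original message's id.
            messageid = mo.group(1)
            break
    print('%s: %s' % (f, messageid))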

in current Mailman's test/bounces/ directory, which contains 86 DSNs. Of
those 86, 12 have no message id for the original message. Of the
remaining 74, all message ids are found with the above.

If the re is changed to

hre = re.compile(r'^message-id:\s*(<.*>)', re.IGNORECASE)

73 of the 74 are found. The miss is llnl_01.txt, which has the
'original message' quoted with '>' characters. A few messages have the
message id in a report section with leading whitespace, but they all
have it later as well without leading whitespace.
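
To see the difference between the two patterns, here is a small sketch
that runs both against three made-up lines, one per case described
above (plain, '>'-quoted, indented). It is written in Python 3 syntax;
the sample lines and the match_id helper are invented for illustration
and are not taken from the test files.

```python
import re

# Anchored pattern: the header must start in column 0.
strict = re.compile(r'^message-id:\s*(<.*>)', re.IGNORECASE)
# Looser pattern: allows one leading '>' and/or leading whitespace.
loose = re.compile(r'^>?\s*message-id:\s*(<.*>)', re.IGNORECASE)

# Hypothetical sample lines, one per case.
samples = {
    'plain': 'Message-ID: <abc@example.com>',
    'quoted': '> Message-Id: <abc@example.com>',
    'indented': '    Message-ID: <abc@example.com>',
}


def match_id(pattern, line):
    """Return the captured message id, or None if the line does not match."""
    mo = pattern.search(line)
    return mo.group(1) if mo else None


results = {}
for name, line in samples.items():
    results[name] = (match_id(strict, line), match_id(loose, line))
    print('%-8s strict=%s loose=%s' % ((name,) + results[name]))
```

Only the plain line matches the anchored pattern; the looser pattern
matches all three.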

In any case, I think the

hre = re.compile(r'^>?\s*message-id:\s*(<.*>)', re.IGNORECASE)

re will likely find anything to be found and is unlikely to find false
hits.
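
If this goes into the bounce-handling code rather than a one-off
script, the same loop could be wrapped in a single function. The
following is only a sketch under assumptions: it uses Python 3 syntax,
the function name find_original_message_id is hypothetical, and the
sample bounce is invented rather than one of the test DSNs.

```python
import email
import re

HRE = re.compile(r'^>?\s*message-id:\s*(<.*>)', re.IGNORECASE)


def find_original_message_id(text):
    """Return the first Message-ID found in the body of a bounce, or None.

    The bounce's own headers are skipped so that its own Message-ID is
    not mistaken for that of the original message.
    """
    msg = email.message_from_string(text)
    in_headers = True
    for line in msg.as_string().splitlines():
        if in_headers:
            # The first blank line ends the top-level headers.
            if line == '':
                in_headers = False
            continue
        mo = HRE.search(line)
        if mo:
            return mo.group(1)
    return None


# Invented sample bounce with the original message quoted using '>'.
SAMPLE = (
    'From: MAILER-DAEMON@example.com\n'
    'Message-ID: <bounce-1@example.com>\n'
    'Subject: Delivery failure\n'
    '\n'
    'Your message could not be delivered.\n'
    '\n'
    '> From: list@example.com\n'
    '> Message-ID: <original-1@example.com>\n'
)

print(find_original_message_id(SAMPLE))
```

A bounce with no quoted or attached original simply yields None, which
corresponds to the 12 test messages that have no id to find.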

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan