Re: [Mailman-Developers] Enhancing the bounce log

March 8, 2008


A.M. Kuchling wrote:
...
How should I write the code to extract the ID?  Looking through the
bounce test messages, there are various formats so we'll need several
functions, similar to how there are several bounced-address parsers in
Mailman.Bouncers.  Should I:
I don't think we do.
I ran the following

import os
import re
import email

hre = re.compile(r'^>?\s*message-id:\s*(<.*>)', re.IGNORECASE)

for f in os.listdir('.'):
    if not f.endswith('.txt'):
        continue
    msg = email.message_from_file(open(f))
    messageid = None
    inheaders = True
    for line in msg.as_string().splitlines():
        # Skip the DSN's own headers; only search the body.
        if inheaders:
            if line == '':
                inheaders = False
            continue
        mo = hre.search(line)
        if mo:
            # Take the first message id found in the body.
            messageid = mo.group(1)
            break
    print('%s: %s' % (f, messageid))

in current Mailman's test/bounces/ directory which contains 86 DSNs. Of
those 86, 12 have no message id for the original message. Of the
remaining 74, all message ids are found with the above.
If the re is changed to

    hre = re.compile(r'^message-id:\s*(<.*>)', re.IGNORECASE)

73 of the 74 are found. llnl_01.txt has the 'original message' quoted
with '>' characters. A few messages have the message id in a report
section with leading whitespace, but they all have it later as well
without leading whitespace.
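
To illustrate the difference between the two patterns, here is a minimal
sketch run against three made-up lines (not taken from the test
messages): one plain, one quoted with '>', and one with leading
whitespace. The names strict_re, loose_re and match_id exist only for
this example.

```python
import re

# Anchored pattern: matches only when the line starts with "message-id:".
strict_re = re.compile(r'^message-id:\s*(<.*>)', re.IGNORECASE)
# Looser pattern: also allows one leading '>' and leading whitespace.
loose_re = re.compile(r'^>?\s*message-id:\s*(<.*>)', re.IGNORECASE)

# Hypothetical sample lines, invented for illustration.
samples = [
    'Message-ID: <plain@example.com>',
    '> Message-Id: <quoted@example.com>',
    '    message-id: <indented@example.com>',
]


def match_id(pattern, line):
    """Return the captured message id, or None if the line does not match."""
    mo = pattern.search(line)
    return mo.group(1) if mo else None


strict_hits = [match_id(strict_re, s) for s in samples]
loose_hits = [match_id(loose_re, s) for s in samples]
```

With these samples the anchored pattern only matches the first line,
while the looser one matches all three.
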
In any case, I think the

    hre = re.compile(r'^>?\s*message-id:\s*(<.*>)', re.IGNORECASE)

re will likely find anything to be found and is unlikely to find false
hits.
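
If this ends up as a single function rather than several, it might look
roughly like the sketch below. This is only an illustration, not code
from Mailman: the function name find_original_message_id is
hypothetical, and it takes the message as a string (via
email.message_from_string) instead of reading files from a directory.

```python
import email
import re

HRE = re.compile(r'^>?\s*message-id:\s*(<.*>)', re.IGNORECASE)


def find_original_message_id(text):
    """Return the first message id found in the body of a DSN, or None.

    Hypothetical helper: 'text' is the full bounce message as a string.
    """
    msg = email.message_from_string(text)
    inheaders = True
    for line in msg.as_string().splitlines():
        # Skip the DSN's own headers so its own Message-ID is not returned.
        if inheaders:
            if line == '':
                inheaders = False
            continue
        mo = HRE.search(line)
        if mo:
            return mo.group(1)
    return None
```

A caller would get None back for the DSNs that carry no message id for
the original message, and would have to handle that case.
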
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan