[spambayes-dev] default to mine_received_headers=True, "may be forged"

Skip Montanaro skip at pobox.com
Mon Dec 22 18:13:44 EST 2003


    >> Okay, I'll leave "(may be forged)" in and add Comcast's "(untrusted
    >> sender)".

    Tim> Don't you think this is a "stupid beats smart" kind of thing?  

For the moment I'd like to at least make a passing stab at understanding
what those phrases mean (or at least what generates them).

If anyone else would like to generate some raw data, you could run something
like this:

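    from spambayes.mboxutils import getmbox
    import re, pprint
    # NOTE: pat was not defined in the snippet as posted.  This is a guess:
    # parenthesized phrases of two or more alphabetic words, which is
    # consistent with the output shown below.
    pat = re.compile(r'\([A-Za-z]+(?: [A-Za-z]+)+\)')
    d = {}                                  # phrase -> number of occurrences
    for msg in getmbox("<Directory Full of Mail>"):
      hdrs = msg.get_all("received", ())
      for hdr in hdrs:
        # collapse folded-header whitespace before matching
        for hit in pat.findall(' '.join(hdr.split())):
          d[hit] = d.get(hit,0)+1
    l = [(d[k], k) for k in d if d[k] > 2]  # keep phrases seen more than twice
    l.sort()
    pprint.pprint(l)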

using a relatively recent CVS checkout (one that has the more general
definition of getmbox()).  The conditional in the list comprehension is just
to trim the output to a reasonable size.  Using a couple training databases
I get:

    [(3, '(HELO bean)'),
     (3, '(HELO ckalin)'),
     (3, '(HELO default)'),
     (3, '(HELO laptop)'),
     (3, '(No client certificate requested)'),
     (3, '(authenticated user wgmachado)'),
     (3, '(may be fabricated)'),
     (4, '(HELO jim)'),
     (4, '(HELO vaio)'),
     (4, '(Postfix MTA)'),
     (4, '(account dave HELO nefarious)'),
     (4, '(verified OK)'),
     (5, '(HELO there)'),
     (6, '(HELO lion)'),
     (7, '(HELO bogdanm)'),
     (8, '(HELO opus)'),
     (8, '(misconfigured sender)'),
     (15, '(NEW ZEALAND STANDARD TIME)'),
     (15, '(untrusted sender)'),
     (17, '(HELO localhost)'),
     (18, '(from localhost)'),
     (19, '(SMTP Server)'),
     (26, '(MET DST)'),
     (28, '(NEW ZEALAND DAYLIGHT TIME)'),
     (435, '(may be forged)')]
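
For anyone who wants to try the same tally without a spambayes checkout, here
is a rough standard-library sketch.  It is not the code that produced the list
above: the PHRASE pattern is only a guess at what was matched (parenthesized
phrases of two or more alphabetic words), and count_received_phrases, min_count
and the sample message are made-up names and data for illustration.

```python
import re
from collections import Counter
from email import message_from_string

# Assumed pattern (a guess, not taken from the original post): a
# parenthesized phrase made of two or more alphabetic words.
PHRASE = re.compile(r"\([A-Za-z]+(?: [A-Za-z]+)+\)")

def count_received_phrases(messages, min_count=1):
    """Tally parenthesized phrases found in Received headers.

    Returns a sorted list of (count, phrase) tuples, keeping only
    phrases seen at least min_count times.
    """
    counts = Counter()
    for msg in messages:
        for hdr in msg.get_all("received", ()):
            # Collapse folded-header whitespace before matching.
            counts.update(PHRASE.findall(" ".join(hdr.split())))
    return sorted((n, phrase) for phrase, n in counts.items() if n >= min_count)

# A hypothetical message with two Received headers.
sample = message_from_string(
    "Received: from foo.example (may be forged)\n"
    "\tby mx.example (Postfix MTA); Mon, 22 Dec 2003 18:13:44 -0500 (EST)\n"
    "Received: from bar.example (HELO laptop) (may be forged)\n"
    "\tby mx.example\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

result = count_received_phrases([sample])
for count, phrase in result:
    print(count, phrase)
```

To run it over real mail, the stdlib mailbox module (mailbox.mbox or
mailbox.Maildir) yields message objects that could be passed in place of the
one-element list.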

I am really starting to worry about those kiwis.  Are these header phrases
part of their master plan for world domination?  Tom Ridge just raised our
alert level in the US to "orange".  Is there a correlation?  Do you think I
should call 9-1-1?

    Tim> be-stupid-be-happy<wink>-ly y'rs  - tim

Every time I try that I'm happy until Ellen hits me with a 2-by-4.  Then my
head hurts like hell for about three days.  <wink>

Skip


