[Spambayes] Spambayes database related question...

skip at pobox.com
Mon Aug 20 02:30:04 CEST 2007


    peter> Have you given any thought to running a central database on a
    peter> local network SpamBayes 'Server'?

There are both MySQL- and PostgreSQL-based classifiers in the SpamBayes
repository.  Both are almost entirely untested.  There is also a
ZEO-based classifier (ZEO being a centralized ZODB, I believe) in the 1.1
alpha series.

    peter> I wish SpamBayes would also keep a list/db of offending network
    peter> address path information within the email internet
    peter> headers... would be extremely useful to generate statistical
    peter> reports for frequently encountered offending IPs etc.

You can get partway there by adding this to your SpamBayes INI file:

    [Headers]
    include_evidence:True

    [Tokenizer]
    mine_received_headers:True
    x-pick_apart_urls:True

You'll then get IP-address-related tokens in the evidence listing when they
are significant.  For example, here are some IP bits from a few random
messages I have in my mailbox right now:

    'url-ip:194.109.207.14/32': 0.35;
    'url-ip:194.109.207/24': 0.35;
    'url-ip:194.109/16': 0.35;
    'url-ip:88.198/16': 0.09;
    'url-ip:88/8': 0.09;
    'received:192.168.1': 0.21;
    'received:10.3.1': 0.16;
    'received:10.3.1.93': 0.16;
    'received:209.191': 0.16;
    'received:66.35.250.225': 0.16;

You can then analyze them at your leisure.
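
As a rough starting point for that analysis, here is a minimal Python sketch
that tallies the IP-related clues.  It assumes you have already pulled the
evidence text out of each message (by default I believe it lands in an
X-Spambayes-Evidence header) and that the clues look like the examples above.
The function names collect_ip_tokens and report are made up for illustration;
they are not part of SpamBayes.  Note that it picks up every 'received:' token,
so any that are host names rather than addresses will be counted too.

```python
import re
from collections import defaultdict

# Matches evidence entries such as  'received:10.3.1': 0.16;
# Only the two token prefixes shown in the examples above are captured.
EVIDENCE_RE = re.compile(r"'((?:url-ip|received):[^']+)':\s*(\d+(?:\.\d+)?)")


def collect_ip_tokens(evidence_texts):
    """Return {token: [spam probabilities]} across all evidence strings."""
    seen = defaultdict(list)
    for text in evidence_texts:
        for token, prob in EVIDENCE_RE.findall(text):
            seen[token].append(float(prob))
    return dict(seen)


def report(seen, min_count=1):
    """Return (count, mean probability, token) rows, most frequent first."""
    rows = [
        (len(probs), sum(probs) / len(probs), token)
        for token, probs in seen.items()
        if len(probs) >= min_count
    ]
    rows.sort(key=lambda row: (-row[0], row[2]))
    return rows


if __name__ == "__main__":
    # Hypothetical input: one evidence string per message.
    samples = [
        "'url-ip:88.198/16': 0.09; 'url-ip:88/8': 0.09; 'received:10.3.1': 0.16;",
        "'received:10.3.1': 0.16; 'received:209.191': 0.16;",
    ]
    for count, mean, token in report(collect_ip_tokens(samples)):
        print(f"{count:4d}  {mean:.2f}  {token}")
```

Raising min_count filters out addresses seen only once, which is probably what
you want for a "frequent offenders" report.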

Skip

