[Spambayes] Handy mbox debug technique (was: mboxtrain croaks on spam mbox file)

Andrew A. Raines aaraines at pobox.com
Thu Sep 18 15:35:16 EDT 2003


"Andrew A. Raines" <aaraines at pobox.com> writes:

> Because an mbox doesn't really give me any placemarkers, and
> mboxtrain doesn't report the line number of the file on error,
> I had to split everything up.  With a little reformail foo I
> created a separate one-file-per-directory directory for every
> message in the mbox which would explicitly show me on which
> message mboxtrain died on.

I had a request for the process I used to accomplish this, so
I'll share it with the public for the requester's sake and
others'.

Copy this into your interactive shell, minus the leading indentation
(the here-document's closing EOF has to start in column one):

    # create a diag workspace
    mkdir /tmp/wk; cd /tmp/wk

    # create a script to exploit reformail(1)'s -s functionality
    cat <<EOF >split
    #!/bin/sh
    messdir=d-\$FILENO; mkdir \$messdir; cd \$messdir
    cat >\$FILENO
    EOF
    chmod +x split

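    # split the mbox into one d-N/N per message
    # (spam.mbx must be reachable from here; adjust the path to your mbox)
    mkdir spam; cd spam
    env FILENO=1 reformail -s ../split <spam.mbx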
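
If reformail isn't installed, roughly the same layout can be had from
awk.  This is only a sketch and not part of the recipe above:
split_mbox is a made-up helper name, and it naively assumes that every
line beginning with "From " starts a new message (i.e. such lines
inside message bodies are already escaped as ">From ").

```shell
# split_mbox MBOX: write each message of MBOX to d-N/N under the current directory.
# Hypothetical helper; assumes body lines starting with "From " are escaped.
split_mbox() {
    awk '
        /^From / {                      # separator line: start the next message
            if (out != "") close(out)   # keep the number of open files small
            n++
            dir = "d-" n
            system("mkdir -p " dir)
            out = dir "/" n
        }
        out != "" { print > out }       # copy the line, separator included
    ' "$1"
}
```

Run it from inside the spam directory with the path to the mbox as its
only argument.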

This creates a directory structure similar to:

    $ pwd
    /tmp/wk
    $ find . | head -15
    .
    ./split
    ./spam
    ./spam/d-1
    ./spam/d-1/1
    ./spam/d-2
    ./spam/d-2/2
    ./spam/d-3
    ./spam/d-3/3
    ./spam/d-4
    ./spam/d-4/4
    ./spam/d-5
    ./spam/d-5/5
    ./spam/d-6
    ./spam/d-6/6
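
A quick sanity check before training is to compare the number of
d-N directories against the number of "From " separator lines in the
original mbox.  The following is a sketch under the same naive
assumption about "From " lines; check_split is a hypothetical helper
name, not something reformail or spambayes provides.

```shell
# check_split MBOX DIR: compare the number of "From " separator lines in MBOX
# with the number of d-N directories in DIR. Hypothetical helper.
check_split() {
    expected=$(grep -c '^From ' "$1" || true)   # grep exits 1 when the count is 0
    actual=0
    for d in "$2"/d-*; do
        if [ -d "$d" ]; then actual=$((actual + 1)); fi
    done
    echo "mbox messages: $expected, directories: $actual"
    [ "$expected" -eq "$actual" ]               # exit status reports the match
}
```

A mismatch usually means a body line starting with "From " was counted
as a separator by grep, so treat the numbers as a hint, not as proof.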

So, in the /tmp/wk/spam directory, set mboxtrain aflame:

    for i in *; do mboxtrain -d ~/.hammiedb-test -s "$i"; done

Now, you can see where chokage ensues:

    Training spam (d-1689):
      Reading as MH mailbox
    Error:Traceback (most recent call last):
      File "/home/aar/src/spambayes/mboxtrain.py", line 304, in ?

Message 1689.  `more d-1689/1689'.  Yay.
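
To avoid scrolling through the output of every run, the loop can stop
at the first directory that fails.  A sketch, assuming the training
command exits non-zero when it dies with a traceback (not verified
here for mboxtrain); first_failure is a hypothetical helper name.

```shell
# first_failure COMMAND [ARGS...]: run COMMAND once per d-N directory, with the
# directory name appended as the last argument, and stop at the first failure.
# Hypothetical helper; relies on COMMAND exiting non-zero when it fails.
# e.g.: first_failure mboxtrain -d ~/.hammiedb-test -s
first_failure() {
    for d in d-*; do
        [ -d "$d" ] || continue
        # COMMAND's own output goes to stderr so that stdout carries only the
        # name of the failing directory.
        if ! "$@" "$d" >&2; then
            echo "$d"
            return 1
        fi
    done
    return 0
}
```

The glob expands in lexical order (d-1, d-10, d-100, ...), so "first"
here means first in that order, not lowest message number.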

HTH,

-Drew



