[Spambayes] Handy mbox debug technique (was: mboxtrain croaks on
spam mbox file)
Andrew A. Raines
aaraines at pobox.com
Thu Sep 18 15:35:16 EDT 2003
"Andrew A. Raines" <aaraines at pobox.com> writes:
> Because an mbox doesn't really give me any placemarkers, and
> mboxtrain doesn't report the line number of the file on error,
> I had to split everything up. With a little reformail foo I
> created a separate one-file-per-directory directory for every
> message in the mbox, which would explicitly show me which
> message mboxtrain died on.
I had a request for the process I used to accomplish this, so
I'll share it with the public for the requester's sake and
others'.
Copy this directly into your interactive shell:
# create a diag workspace
mkdir /tmp/wk; cd /tmp/wk
# create a script to exploit reformail(1)'s -s functionality
cat <<EOF >split
#!/bin/sh
messdir=d-\$FILENO; mkdir \$messdir; cd \$messdir
cat >\$FILENO
EOF
chmod +x split
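# split the mbox, one message per d-N directory
# (adjust the path to spam.mbx: after the cd it is read relative to /tmp/wk/spam)
mkdir spam; cd spam
env FILENO=1 reformail -s ../split <spam.mbx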
This creates a directory structure similar to:
$ pwd
/tmp/wk
$ find . | head -15
.
./split
./spam
./spam/d-1
./spam/d-1/1
./spam/d-2
./spam/d-2/2
./spam/d-3
./spam/d-3/3
./spam/d-4
./spam/d-4/4
./spam/d-5
./spam/d-5/5
./spam/d-6
./spam/d-6/6
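To see what the split script does for a single message, it can be run by hand, roughly the way reformail -s runs it once per message. This is only a sketch: the scratch directory from mktemp and the made-up message are assumptions for illustration, and FILENO is set by hand here instead of coming from reformail.

```shell
# scratch area so nothing in /tmp/wk is touched (hypothetical location)
demo=$(mktemp -d)
cd "$demo"

# same script as above; \$ keeps the variables unexpanded until split runs
cat <<EOF >split
#!/bin/sh
messdir=d-\$FILENO; mkdir \$messdir; cd \$messdir
cat >\$FILENO
EOF
chmod +x split

mkdir spam; cd spam

# feed one made-up message on stdin, with FILENO set by hand
printf 'From nobody Thu Sep 18 15:35:16 2003\nSubject: test\n\nbody\n' |
    env FILENO=7 ../split

ls d-7    # lists the single file 7
```

The script makes the directory d-$FILENO, changes into it, and writes whatever arrives on stdin to a file named after the message number.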
So, in the /tmp/wk/spam directory, set mboxtrain aflame:
for i in *; do mboxtrain -d ~/.hammiedb-test -s "$i"; done
Now, you can see where chokage ensues:
Training spam (d-1689):
Reading as MH mailbox
Error:Traceback (most recent call last):
File "/home/aar/src/spambayes/mboxtrain.py", line 304, in ?
Message 1689. `more d-1689/1689'. Yay.
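If scrolling back through the output of a few thousand runs is tedious, the loop can be wrapped so that only the failing directories are printed. A minimal sketch, assuming a failure shows up either as the word "Traceback" in the output (as in the run above) or as a non-zero exit status; find_failures is a name made up for this example, not part of Spambayes.

```shell
# find_failures CMD [ARGS...]
# Runs CMD once per d-* directory, with the directory appended as the
# last argument, and prints each directory whose run exited non-zero
# or printed "Traceback".
find_failures() {
    for dir in d-*; do
        [ -d "$dir" ] || continue
        if ! out=$("$@" "$dir" 2>&1) ||
           printf '%s\n' "$out" | grep -q 'Traceback'; then
            echo "$dir"
        fi
    done
}
```

From /tmp/wk/spam this would be called as `find_failures mboxtrain -d ~/.hammiedb-test -s`. The shell expands d-* in lexical order (d-1, d-10, d-100, ...), so the directories are not visited in message order.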
HTH,
-Drew