Paul Moore lists@morpheus.demon.co.uk
Fri Nov 8 21:07:45 2002

"Tim Peters" <tim@zope.com> writes:

[About the plugin code...]
> I'm more lost than not in it myself!

That makes me feel better :-)

[About bothering with leaving list traffic out]
> Don't worry about it before you try it.  I suggest trying it because I'm not
> sure it's possible to *stop* the system now from scoring all incoming msgs
> (the "new msg in Inbox" filter appears to trigger for every one, regardless
> of whether the RW decides to move it; after that it may just be a race
> between the RW and the addin deciding where to move each).

OK, I've switched over. I now have one Spam folder, one Potential Spam
folder, and the rest are Ham (actually, some historic archive folders
I've left out, but that's just because I never use them any
more). We'll see how it goes.

>> Of course, I know that the classifier *really* works by magic, and
>> so my intuition is useless :-)
> It's more that unless you know exactly how the math works, your intuition is
> simply baseless here, carried over from some other experience.  Do *you*
> have trouble distinguishing personal and work email from spam?  There you
> go, and you can't even compute inverse chi-squared probabilities to 14
> significant digits on demand in your head <wink>.

How do *you* know I can't compute inverse chi-squared probabilities in
my head? Oh, hang on - you wanted me to get the right answer, didn't
you? :-)

> What's to manage?  I get about 600 emails per day, and about 1% end
> up in Unsure (about 6 -- actually less than that, lately; the system
> is learning).

My ratio is still a lot worse than that. But as I say, my training
corpus is still quite small. But you're right - managing a few mails
isn't hard. It's just that the overall results are *so* much better
than the old home-grown solution I used that I became instantly spoiled.

Seriously, I've said this before, but what you guys have developed
here is *phenomenally* good. I've reached the point where I look
forward to getting spam, just because I enjoy so much seeing it
automatically appear in the spam folder :-)

>> My instinctive reaction is that I want "Spam" and "Not Spam" buttons,
>> and then I read or delete the message in situ.
> MarkH has since implemented this in the Unsure folder.

Time for a CVS update, I guess...

> I still think you're making life too complicated.  Is list traffic
> spam?  If so, call it spam.  If not, call it ham.

Sounds sensible. I think that all the troubles I've had in the past
trying to manage spam have left me with an instinctive feeling that
the problem is complicated. This leads to looking for complicated
solutions.

But you're right. The spam/ham distinction itself is a simple yes/no,
so the setup should be, too.

But permit me to drag my feet a little, as I throw away all my
cherished preconceptions :-)

More seriously, I'm putting this point into my spambayes notes
folder. I suspect it's something a lot of new users will have to get
used to.

Thanks for the comments,

