[Spambayes] Habeas marked email

Michael Kimball michael at kimballpottery.com
Mon May 10 09:51:19 EDT 2004

Since the only Habeas-marked email I've received so far has been from
Habeas itself, and I am unlikely to receive any other legitimate
Habeas-marked mail, this may be a rather pointless exercise.  If you DO
drop it from the 1.0 release, is there any way I (or any other SB user)
could add it back in?  I thought I saw something about adding custom
tokens/filters/wudayumucallums, but I can't find it now.

If I'm understanding the purpose/focus of Habeas, the only legitimate
Habeas-marked email I'd receive would be from mailing lists I sign up for.
If most of those lists don't use it, I'd be unlikely to receive any.
Hmmm, maybe I should check the headers in some of the lists I've
subscribed to but ignore.  Maybe it is already there.  (I subscribed to
a few that I no longer read but am not ready to unsubscribe from.  I've
just set filters to drop them directly into the Trash folder for now.)

Tony Meyer wrote:
> > Just found the Habeas related buttons in the experimental
> > configuration page and am wondering how habeas marked mail is
> > used.  Presumably SpamBayes weights it towards 'Ham', but
> > what happens then?
> SpamBayes treats it as just another token (or nine tokens, depending on the
> second option), which might be towards ham or might be towards spam,
> depending on the presence of habeas headers in mail you've trained on.  If
> the message has nine valid habeas headers, and you've trained ham and no spam
> with nine valid habeas headers, then that'll be nine more strong ham clues
> for that message.  If, on the other hand, habeas headers are equally likely
> in ham and spam for you, then the clues will be neutral.  Note that it'll
> churn out a different clue depending on whether the header is valid or not
> (so valid headers could be ham clues and invalid headers spam clues, for
> example).
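
To make the "just another token" point concrete, here is a minimal sketch
of how a token's spam probability can be derived from training counts,
using a Robinson-style smoothed estimate.  This is an illustration, not
SpamBayes's actual code: the function name and the defaults s=0.45 and
x=0.5 are assumptions chosen for the example.

```python
def token_spam_prob(spam_count, ham_count, n_spam, n_ham, s=0.45, x=0.5):
    """Smoothed spam probability for a single token (illustrative only).

    spam_count / ham_count: trained spam / ham messages containing the token.
    n_spam / n_ham: total trained spam / ham messages.
    s: strength given to the prior; x: probability assumed for an unseen token.
    """
    spam_ratio = spam_count / n_spam if n_spam else 0.0
    ham_ratio = ham_count / n_ham if n_ham else 0.0
    if spam_ratio + ham_ratio == 0:
        # Token never seen in training: fall back to the neutral prior.
        return x
    p = spam_ratio / (spam_ratio + ham_ratio)
    n = spam_count + ham_count
    # Pull the raw estimate towards x when there is little evidence.
    return (s * x + n * p) / (s + n)


# A habeas token seen only in ham is a strong ham clue (close to 0).
ham_only = token_spam_prob(spam_count=0, ham_count=20, n_spam=100, n_ham=100)
# A habeas token seen equally often in ham and spam is neutral (0.5).
balanced = token_spam_prob(spam_count=10, ham_count=10, n_spam=100, n_ham=100)
```

With counts like these, a token trained only as ham scores near 0, while
one seen equally in both classes stays at 0.5 and contributes nothing.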

I've given this a bit more thought since my first post.  I don't think
there will be anything in the email itself that indicates whether the
Habeas headers are valid or invalid.  I think it is the server-based spam
filters that determine this, by checking the Habeas whitelist.

> > I normally read my email with just 'Normal' headers visible,
> > so I wouldn't see the Habeas headers so would be unlikely to
> > report Habeas abuse to Habeas.com.
> The way SpamBayes uses the headers doesn't really lend itself to reporting
> abuse, since it's not obvious when this occurs (there's no "I think this is
> spam, but it has the habeas headers" message, for example).
> However, if a spam has the headers, and they are strong ham clues for you,
> then it might be more likely to end up classified as unsure/ham.  In this
> case, you'd look at the clues SpamBayes generated, see the habeas tokens,
> and realise that you need to report it as abuse.

O.K. This makes sense.  I should make a filter in my mail client to look
for email that SpamBayes classified as 'Spam' but that DOES contain Habeas
marks.  There would then be a strong possibility that anything caught in
that folder is abused/invalid Habeas mail.
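
For anyone whose mail client can't express that rule, a rough sketch of
the same check in Python follows.  It assumes the classification arrives
in an X-Spambayes-Classification header whose value starts with "spam",
and that Habeas marks appear as headers named X-Habeas-SWE-1 through
X-Habeas-SWE-9; the function name is made up for this example.

```python
from email import message_from_string


def is_suspect_habeas(raw_message):
    """Return True if a message is classified as spam yet carries Habeas headers."""
    msg = message_from_string(raw_message)
    # Assumed header name; the value may carry a score after the label.
    classification = (msg.get("X-Spambayes-Classification") or "").strip().lower()
    # Assumed Habeas header naming: X-Habeas-SWE-<n>.
    has_habeas = any(
        name.lower().startswith("x-habeas-swe-") for name in msg.keys()
    )
    return classification.startswith("spam") and has_habeas


sample = (
    "X-Spambayes-Classification: spam\n"
    "X-Habeas-SWE-1: placeholder text\n"
    "Subject: example\n"
    "\n"
    "body\n"
)
suspect = is_suspect_habeas(sample)
```

Messages that match could then be moved to a separate folder and reviewed
by hand before anything is reported to Habeas.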

> If you do end up turning these options on, it'd be great to hear back
> whether you feel they helped at all.  Since they're an experimental option,
> they're fairly likely to disappear in the first post-1.0 release, since they
> haven't really demonstrated that they make a significant improvement to
> results (habeas headers just aren't that widely used).  Feedback from users
> would help when making the decision whether to cut the option or not.

I guess unless and until I get real spam with counterfeit Habeas
headers, I could cut and paste the headers into spam I've already
received and retrain SB on those.
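
The cut-and-paste step could be scripted along these lines.  This is only
a sketch: it assumes the Habeas marks are headers named X-Habeas-SWE-<n>,
the function name is made up, and the retraining itself is left to
whatever SpamBayes front end is in use.

```python
from email import message_from_string


def copy_habeas_headers(source_raw, target_raw):
    """Copy every X-Habeas-SWE-* header from one raw message into another.

    Returns the modified target message as a string.
    """
    source = message_from_string(source_raw)
    target = message_from_string(target_raw)
    for name, value in source.items():
        if name.lower().startswith("x-habeas-swe-"):
            # Assigning to a Message appends a header; it does not replace one.
            target[name] = value
    return target.as_string()


habeas_mail = "X-Habeas-SWE-1: first line\nX-Habeas-SWE-2: second line\nSubject: hello\n\nhi\n"
old_spam = "Subject: buy now\n\nspam body\n"
doctored = copy_habeas_headers(habeas_mail, old_spam)
```

The result is an artificial training sample, so it may not resemble what
real counterfeit Habeas spam would look like.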

> =Tony Meyer
> ---
> Please always include the list (spambayes at python.org) in your replies
> (reply-all), and please don't send me personal mail about SpamBayes. This
> way, you get everyone's help, and avoid a lack of replies when I'm busy.
