[spambayes-dev] subject parsing

Seth Goodman nobody at spamcop.net
Mon Jan 26 15:45:38 EST 2004

Just in passing, I noticed that a spam with the following subject line:

Try a Free H-G-H sample ... $49.95 value!

generated the following subject tokens (from 'All Message Tokens' in spam

'subject: '
'subject: ... $'

Three observations:

1) having a token for '-' but not for 'H-G-H' appears to ignore
important information

2) a token for a single space seems of dubious value, but if it worked
better in testing, fine

3) the token for ' ... $' seems to be an odd choice for parsing
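
One possible explanation, offered only as a guess: if the tokenizer emits each
run of non-word characters in the subject as its own token, the subject above
would yield the ' ', '-' and ' ... $' tokens, while 'H-G-H' would never appear
as a unit. The Python sketch below illustrates that idea. The pattern and the
function name punctuation_tokens are assumptions made for illustration, not
taken from the SpamBayes source.

```python
import re

# Assumption: a hypothetical pattern for "runs of non-word characters".
# It is not copied from the SpamBayes tokenizer.
PUNCTUATION_RUN = re.compile(r"\W+")


def punctuation_tokens(subject):
    """Return one 'subject:'-prefixed token per distinct run of
    non-word characters, in order of first appearance."""
    seen = []
    for run in PUNCTUATION_RUN.findall(subject):
        token = "subject:" + run
        if token not in seen:
            seen.append(token)
    return seen


tokens = punctuation_tokens("Try a Free H-G-H sample ... $49.95 value!")
print(tokens)
```

Under this assumption the hyphens inside 'H-G-H' split it into single letters
and separate '-' tokens, which would account for observation 1.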

Seth Goodman

off-list replies to sethg [at] GoodmanAssociates [dot] com
