[spambayes-dev] spammy subject lines
sosman at users.sourceforge.net
Fri Oct 10 19:23:20 EDT 2003
I am getting quite a bit of spam with subject lines like the following:
subject: Lon.g an^d Str;ong al)l Nigh_t j-jcgzies
subject: Ch-eck ou=t ou-r sel)ection _of grea)t R_X -emgffj
Looking at the tokenizer code for subject lines I was wondering if there was
value in stripping punctuation then doing the usual word tokenisation.
It seems there are other special cases taken into account for the subject
line so care would need to be taken not to break those.
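
As a rough sketch of the idea only (this is not the existing SpamBayes
tokenizer code, and the helper name strip_punct_tokens is made up for
illustration), stripping punctuation before splitting on whitespace might
look like this:

```python
import re

# Hypothetical helper, not part of the SpamBayes codebase.
# Matches any character that is neither a word character nor whitespace,
# plus the underscore (which \w would otherwise treat as a word character).
_PUNCT = re.compile(r"[^\w\s]|_")

def strip_punct_tokens(subject):
    """Delete punctuation from a subject line, then split on whitespace."""
    cleaned = _PUNCT.sub("", subject)
    return cleaned.split()

print(strip_punct_tokens("Lon.g an^d Str;ong al)l Nigh_t j-jcgzies"))
```

Note that this blunt approach also glues together legitimately hyphenated
words and discards any punctuation the existing special cases may rely on,
so it would presumably need to generate extra tokens alongside the current
ones rather than replace them.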
I would be happy to have a crack at a patch if this hasn't been tried
already. I just wanted to float the idea first, given that I am unfamiliar
with the existing codebase and unsure whether it might have already been