bill parducci bill at parducci.net
Mon Mar 31 15:21:24 EST 2003

currently, does spambayes treat a URL as a single token or is it parsed?

it would seem that if URLs were parsed you would be able to train 
spambayes to detect mail with odious content based on the components of 
its links.

take the example: http://check.myspam.com/ad/junk?random=fsldkjflksj

it would seem that the most accurate way to evaluate this would be to 
parse using '/' (starting after 'http://'). that would allow spambayes 
to evaluate the domain (check.myspam.com) while giving it the ability to 
differentiate between directories (which may map to users on ISP 
systems: http://user.aol.com/niceguy vs. http://user.aol.com/spammer).
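
to make the idea concrete, here is a minimal sketch of the kind of splitting i mean. this is not how spambayes actually tokenizes anything; the function name tokenize_url and the "url-host:" / "url-path:" prefixes are made up for illustration, and the query string is simply dropped on the assumption that it is mostly random noise.

```python
from urllib.parse import urlsplit


def tokenize_url(url):
    """Split a URL into one host token plus one token per path segment.

    Hypothetical helper for illustration only; the token prefixes are
    assumptions, not anything spambayes defines.
    """
    parts = urlsplit(url)
    tokens = []
    if parts.hostname:
        # the domain becomes a single token (urlsplit lowercases it)
        tokens.append("url-host:" + parts.hostname)
    # split the path on '/' so each directory can be scored separately
    for segment in parts.path.split("/"):
        if segment:
            tokens.append("url-path:" + segment)
    return tokens


if __name__ == "__main__":
    print(tokenize_url("http://check.myspam.com/ad/junk?random=fsldkjflksj"))
    print(tokenize_url("http://user.aol.com/niceguy"))
    print(tokenize_url("http://user.aol.com/spammer"))
```

with this split, the two aol.com links share the same host token but differ in their path token, which is the distinction a classifier would need in order to score "niceguy" and "spammer" separately.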


