[Spambayes-checkins] spambayes/testtools urlslurper.py,1.6,NONE

Tony Meyer anadelonbrin at users.sourceforge.net
Wed Dec 17 04:16:56 EST 2003


Update of /cvsroot/spambayes/spambayes/testtools
In directory sc8-pr-cvs1:/tmp/cvs-serv671/testtools

Removed Files:
	urlslurper.py 
Log Message:
Add the basis of a new experimental (and highly debatable) option to 'slurp' URLs.

This is based on the urlslurper.py script in the testtools directory, which in turn
was based on Richard Jowsey's URLSlurper.java.

Basically, when the option is enabled, instead of just tokenizing the URLs in a message,
we also retrieve the content at that address (if it's not text, we ignore it).
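
A minimal sketch of the "retrieve, but only keep text" step described above. It is
not the code from urlslurper.py: the function name slurp_text, the opener parameter,
the timeout value and the latin-1 fallback are all assumptions made for illustration.

```python
import urllib.request


def slurp_text(url, opener=urllib.request.urlopen, timeout=5):
    """Return the body at `url` as a string, or None if it is not text.

    Hypothetical helper: `opener` is injectable so the function can be
    exercised without network access.
    """
    try:
        with opener(url, timeout=timeout) as response:
            # Ignore anything whose Content-Type is not text/* (images, etc.).
            if response.headers.get_content_maintype() != "text":
                return None
            # Assumed fallback charset when the server does not declare one.
            charset = response.headers.get_content_charset() or "latin-1"
            return response.read().decode(charset, errors="replace")
    except (OSError, ValueError, LookupError):
        # Network failure, malformed URL, or unknown charset: treat the
        # URL as contributing nothing.
        return None
```

The returned text would then be passed through the normal tokenizer; a None result
means the URL adds no extra tokens.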

When classifying, if the message has a 'raw' score in the unsure range, and if the
number of tokens is less than max_discriminators, and adding these 'slurped' tokens
would push the message into the ham/spam range, then they are used.
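
The three conditions above can be sketched roughly as follows. This is an
illustration only, not the checked-in implementation: score_fn stands in for the
classifier, the function name is hypothetical, the default cutoffs and
max_discriminators value are assumed, and treating a score equal to a cutoff as
already decided is a guess.

```python
def score_with_slurp(tokens, slurped_tokens, score_fn,
                     ham_cutoff=0.20, spam_cutoff=0.90,
                     max_discriminators=150):
    """Return the score to report, using slurped tokens only when allowed."""

    def is_unsure(score):
        # Assumed boundary handling: strictly between the two cutoffs.
        return ham_cutoff < score < spam_cutoff

    tokens = list(tokens)
    raw = score_fn(tokens)

    # Condition 1: the 'raw' score must be in the unsure range.
    if not is_unsure(raw):
        return raw
    # Condition 2: the message must have fewer tokens than max_discriminators.
    if len(tokens) >= max_discriminators:
        return raw
    # Condition 3: the slurped tokens must push the message out of unsure.
    combined = score_fn(tokens + list(slurped_tokens))
    if is_unsure(combined):
        return raw
    return combined
```

With this shape the slurped tokens can never turn a confident result into an
unsure one; they are consulted only as a tie-breaker.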

This script is no longer necessary; use the experimental URLRetriever options
instead.

--- urlslurper.py DELETED ---




