[spambayes-dev] RE: [Spambayes-checkins] spambayes/testtools urlslurper.py, 1.6, NONE
tameyer at ihug.co.nz
Wed Dec 17 04:21:45 EST 2003
Oops. The comment window had scrolled down and I didn't notice. Only the
last line of the log message below should be in the comments for this
checkin.
> -----Original Message-----
> From: spambayes-checkins-bounces at python.org
> [mailto:spambayes-checkins-bounces at python.org] On Behalf Of Tony Meyer
> Sent: Wednesday, 17 December 2003 10:17 p.m.
> To: spambayes-checkins at python.org
> Subject: [Spambayes-checkins] spambayes/testtools
> Update of /cvsroot/spambayes/spambayes/testtools
> In directory sc8-pr-cvs1:/tmp/cvs-serv671/testtools
> Removed Files:
> 	urlslurper.py
> Log Message:
> Add the basis of a new experimental (and highly debatable) option to
> 'slurp' URLs.
> This is based on the urlslurper.py script in the testtools directory,
> which in turn was based on Richard Jowsey's URLSlurper.java.
> Basically, when the option is enabled, instead of just tokenizing the
> URLs in a message, we also retrieve the content at that address (if
> it's not text, we ignore it).
> When classifying, if the message has a 'raw' score in the unsure range,
> and if the number of tokens is less than max_discriminators, and adding
> these 'slurped' tokens would push the message into the ham/spam range,
> then they are used.
> This isn't necessary anymore; use the experimental URLRetriever options
> --- urlslurper.py DELETED ---
> Spambayes-checkins mailing list
> Spambayes-checkins at python.org
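For anyone puzzling over the stray log text quoted above, the decision it
describes can be sketched roughly as follows. This is only an illustration
of the three conditions in that message, not the actual SpamBayes code:
the function name, the score_with_slurped callback, and the cutoff values
used as defaults are all assumptions made for the example.

```python
def choose_score(raw_score, n_tokens, score_with_slurped,
                 ham_cutoff=0.20, spam_cutoff=0.90,
                 max_discriminators=150):
    """Return the score to classify with, per the quoted log message.

    All names and default values here are hypothetical, chosen only to
    illustrate the rule; they are not taken from the SpamBayes source.
    score_with_slurped is a zero-argument callable that returns the
    score after the tokens from the retrieved URL content are added.
    """
    # Condition 1: only messages whose raw score is unsure are candidates.
    # (Here "unsure" means ham_cutoff <= score < spam_cutoff.)
    if not (ham_cutoff <= raw_score < spam_cutoff):
        return raw_score

    # Condition 2: the message must have fewer tokens than
    # max_discriminators.
    if n_tokens >= max_discriminators:
        return raw_score

    # Condition 3: the slurped tokens are used only if they push the
    # message out of the unsure range, into ham or spam.
    slurped = score_with_slurped()
    if slurped < ham_cutoff or slurped >= spam_cutoff:
        return slurped
    return raw_score
```

Passing the slurped score as a callable means the (slow) URL retrieval
only has to happen for messages that pass the first two checks.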