[Spambayes] What the heck

Tim Peters tim.one at comcast.net
Tue Dec 10 17:50:30 EST 2002


> http://spamland.org/jsp/Wiki?ToDestroySpamIncludeThisLinkInAllLegitEmails


> It generates 6 tokens:
>
> proto:http
> url:spamland
> url:org
> url:jsp
> url:wiki
> url:todestroyspamincludethislinkinalllegitemails


> skip_max_wordsize only applies to words, not to url fragments?

It does not apply to url fragments.  As to the first half of the question,
tokenizer.py is open for inspection <wink>.
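
For anyone who wants the gist without opening the file, here is a rough
sketch of how a URL can end up as the six tokens quoted above.  It is not
the code in tokenizer.py: the regexes and the url_tokens name are made up
for illustration, and the real tokenizer does more than this.  The part
that matters is that no length limit is consulted anywhere in the loop.

```python
import re

# Hypothetical patterns for illustration only; the real ones in
# tokenizer.py may differ.
URL_RE = re.compile(r"(?P<proto>https?|ftp)://(?P<guts>[^\s<>\"']+)")
URL_SEP_RE = re.compile(r"[;?:@&=+,$.]")


def url_tokens(text):
    """Return proto:/url: tokens for each URL found in text."""
    tokens = []
    # Lowercase first, so "Wiki" comes out as "url:wiki".
    for match in URL_RE.finditer(text.lower()):
        tokens.append("proto:" + match.group("proto"))
        # Split on slashes, then on the other URL separator characters.
        for piece in match.group("guts").split("/"):
            for chunk in URL_SEP_RE.split(piece):
                if chunk:
                    # No maximum-word-size check here: a long chunk
                    # becomes a single url: token, whatever its length.
                    tokens.append("url:" + chunk)
    return tokens


if __name__ == "__main__":
    link = ("http://spamland.org/jsp/"
            "Wiki?ToDestroySpamIncludeThisLinkInAllLegitEmails")
    for token in url_tokens(link):
        print(token)
```

Run on the link in question, the sketch prints the same six tokens as the
list above, long final chunk included.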



