[Spambayes] a silly question
tameyer at ihug.co.nz
Mon Jan 17 00:54:06 CET 2005
> It seems to me that when I first installed it yesterday
> I somehow found a list of the training corpus, all the tokens
> and their frequency of occurrence. Today I cannot remember
> where I saw that list, nor how to get back to it. Can anyone help me?
If you want a list of the tokens for one particular message, you click the
"Tokens" link on the review page next to that message.
If you want to search for a particular token, you use the "word query" box
on the main page of the web interface. That box lets you use wildcards (or
regular expressions), so if you want to see every token in the database you
can just enter "*" (or ".*" for regex). It only shows a limited number (10
by default) in the results, but you can increase that (and the limited
display will tell you how many weren't shown). Note that displaying an
entire database could take some time (and it's possible that the browser
You can also use the sb_dbexpimp.py script (if you have the source) to
convert the database to a CSV file, and open it in (e.g.) Excel.
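
If you would rather not use a spreadsheet, the exported file can also be read with a few lines of Python. This is only a rough sketch. It assumes the export has a first row holding the total ham and spam message counts, followed by one row per token in the form word,hamcount,spamcount; check your own file before relying on that. The function name and the sample data are made up for illustration.

```python
import csv
import io


def load_token_counts(fileobj):
    """Read an exported token CSV; return (nham, nspam, rows).

    Assumed layout (verify against your own export):
      row 1:  nham,nspam
      rest:   word,hamcount,spamcount
    """
    reader = csv.reader(fileobj)
    header = next(reader)
    nham, nspam = int(header[0]), int(header[1])
    rows = []
    for row in reader:
        if not row:  # skip blank lines
            continue
        word, ham, spam = row[0], int(row[1]), int(row[2])
        rows.append((word, ham, spam))
    return nham, nspam, rows


# Hypothetical sample standing in for a real export file.
sample = io.StringIO("120,80\nviagra,0,42\nmeeting,37,1\nfree,12,30\n")
nham, nspam, rows = load_token_counts(sample)

# List tokens by total number of occurrences, most frequent first.
rows.sort(key=lambda r: r[1] + r[2], reverse=True)
for word, ham, spam in rows:
    print(word, ham, spam)
```

For a real export, replace the sample with something like open("export.csv", newline="") (the file name here is just a placeholder).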
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes.
http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this.