[Spambayes] sharing split database

Meyer, Tony T.A.Meyer at massey.ac.nz
Wed May 21 12:48:30 EDT 2003


> I'm not sure that this would gain anything; in order to make 
> the weights usable, they need to be indexed by word, which 
> effectively puts the words right back in with them.  I 
> suppose you could assign each word a unique hash key or 
> something, and then index the weights by that...
[...]

Interesting that this came up now.  There was a message the other day
about using a SQL database behind spambayes (for the multiple-user
situation).  The suggested tables did exactly this: there was a table of
tokens, and the stats were referenced by token id.  I implemented a
storage class in this style, although at the time it seemed like
unnecessary overhead.  I presume the reasoning behind that design was
the same as the reasoning in this thread.
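
For anyone who wants to picture that layout, here is a rough sketch of a
token table with per-user stats keyed by token id.  This is not the
storage class mentioned above and not the schema from the earlier
message; the table names, column names and helper functions are all
made up for illustration, and sqlite3 is used only because it ships with
current Python.

```python
import sqlite3


def make_store(path=":memory:"):
    """Create the two hypothetical tables: tokens, and stats keyed by token id."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS tokens (
            id   INTEGER PRIMARY KEY,
            word TEXT UNIQUE NOT NULL
        );
        CREATE TABLE IF NOT EXISTS stats (
            token_id INTEGER NOT NULL REFERENCES tokens(id),
            user     TEXT    NOT NULL,
            nspam    INTEGER NOT NULL DEFAULT 0,
            nham     INTEGER NOT NULL DEFAULT 0,
            PRIMARY KEY (token_id, user)
        );
    """)
    return conn


def token_id(conn, word):
    """Return the id for a word, adding it to the shared token table if new."""
    conn.execute("INSERT OR IGNORE INTO tokens (word) VALUES (?)", (word,))
    row = conn.execute("SELECT id FROM tokens WHERE word = ?", (word,)).fetchone()
    return row[0]


def learn(conn, user, word, is_spam):
    """Bump one user's spam or ham count for a word."""
    tid = token_id(conn, word)
    conn.execute(
        "INSERT OR IGNORE INTO stats (token_id, user) VALUES (?, ?)", (tid, user)
    )
    # The column name comes from a fixed pair, never from user input.
    column = "nspam" if is_spam else "nham"
    conn.execute(
        f"UPDATE stats SET {column} = {column} + 1 WHERE token_id = ? AND user = ?",
        (tid, user),
    )


def counts(conn, user, word):
    """Return (nspam, nham) for a user and word, or (0, 0) if unseen."""
    row = conn.execute(
        "SELECT s.nspam, s.nham FROM stats s "
        "JOIN tokens t ON t.id = s.token_id "
        "WHERE t.word = ? AND s.user = ?",
        (word, user),
    ).fetchone()
    return row if row else (0, 0)
```

The words are stored once in the shared table, and each user's counts
refer to them only by id.  The price is the extra lookup (or join) on
every access, which is the overhead I had in mind.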

> I don't have any particular objection to this... do you have 
> a snippet of code to extract the wordlist from a db, for 
> those of us too lazy to come up with it on our own?

I note that there is a code snippet that uses sha, but for (everyone's)
future reference, it might be worth pointing out that you can use
dbimpexp.py (in the main directory) to extract a db (or pickle) to a `
separated text file.
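
As for the hash-key idea quoted at the top, something along these lines
is what I would expect it to look like.  This is only a sketch, not the
snippet posted to the list: the function names and the shape of the
counts dictionary are assumptions, and it uses hashlib from current
Python rather than the old sha module.

```python
import hashlib


def hash_token(word):
    """Return a SHA-1 hex digest to stand in for the word itself."""
    return hashlib.sha1(word.encode("utf-8")).hexdigest()


def anonymise(counts):
    """Re-key a {word: (nspam, nham)} mapping by hashed word."""
    return {hash_token(word): stats for word, stats in counts.items()}


def lookup(hashed_counts, word):
    """Look up a plain word in a hashed mapping; (0, 0) if it is not there."""
    return hashed_counts.get(hash_token(word), (0, 0))
```

Anyone scoring a message hashes each of its tokens the same way and
looks up the result, so the shared file never contains the words in
plain text.  Common words can still be recovered by hashing a
dictionary and comparing, so this hides the wordlist only from casual
inspection.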

=Tony Meyer


