Sharing database among multiple IMAP clients?
Hello! I'm completely new to SpamBayes, and I'm trying to figure out how I want to set it up for myself. Thanks to the list in advance.

I'll probably be using an IMAP server and accessing my mail from two main places - work and home. My question is: how can I (and should I, even) share the training data between the two locations? Ideally I'd like to maintain just one database, since this will all be a single set of mail; otherwise I'd have to repeat the same training at home and at work.

Several possibilities occur to me, and I'm curious to hear other users' suggestions on these:

1. Manually copy the database file(s) from one location to the other. There are several variations on this. I could just do the initial training in one location and copy to the other, then maintain each database separately thereafter, expecting the follow-on training to take much less work. Or I could copy the DB each time I did some training. Or, if there's some reliable way to merge the two slightly different DBs, I could periodically do that.

2. Maintain the database file(s) on a server somewhere. This is really more what I want, but it's harder to arrange. The best thing, obviously, would be to run SpamBayes on the server, but I'm expecting that, for reliability and simplicity, I'll probably be using a shared commercial server where I can't install SpamBayes. But even if I'm running SpamBayes on the client, I could maintain a single DB in some online location. I could stick it on an FTP or rsync server, and have a script (or maybe even write a SpamBayes extension) that downloads it when I begin using SpamBayes and uploads it when I'm done. I could store the DB at work and map a Windows network drive to point at my work machine through my VPN, and configure SpamBayes at home to use the DB on the network drive - but my VPN isn't completely reliable, so that could be a hassle. Or perhaps there are distributed-DB features built into some of the storage options that SpamBayes can use?

3. Carry a little USB drive around with me, and keep the DB on that. (Does the DB get too big for this to be practical?)

4. Super Crazy Ninja Trick?: enhance the SpamBayes IMAP proxy with the ability to maintain a DB in a folder on the IMAP server, download it before beginning filtering, and upload it whenever it is modified. If this seems productive, and the feature doesn't yet exist, I'd be happy to add it if I can find the time.

Surely others have run into the same DB-replication question. Is there any conventional wisdom on this yet? I didn't find anything obviously related in the wiki, the FAQ, or the mail archives. I realize the question probably does come down to a simple issue of "how do I share a file between multiple sites?", which isn't particular to SpamBayes, but maybe SpamBayes users are a good crowd to ask such a question of.

Any help is much appreciated!

Best,
Deneb Meketa, San Francisco.
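The "download it when I begin, upload it when I'm done" idea from option 2 could be sketched roughly as below. This is only an illustration, not anything SpamBayes ships: the function names `pull` and `push` are hypothetical, the "shared" location is assumed to be reachable as an ordinary file path (a mapped network drive, a USB stick, or a directory kept in step by rsync), and the database is assumed to live in a single file.

```python
import shutil
from pathlib import Path


def newer(a: Path, b: Path) -> bool:
    """Return True if file a exists and is newer than b (or b is missing)."""
    if not a.exists():
        return False
    return not b.exists() or a.stat().st_mtime > b.stat().st_mtime


def pull(shared: Path, local: Path) -> bool:
    """Before filtering: fetch the shared database if it is newer than ours."""
    if newer(shared, local):
        shutil.copy2(shared, local)  # copy2 also carries over the modification time
        return True
    return False


def push(local: Path, shared: Path) -> bool:
    """After training: publish our database if it is newer than the shared copy."""
    if newer(local, shared):
        shutil.copy2(local, shared)
        return True
    return False
```

A wrapper would call `pull()` before starting SpamBayes and `push()` after it exits. Note that this is last-writer-wins: if training happens at both locations between syncs, one side's changes are overwritten, so it only suits a setup where one location is in use at a time.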
> My question is: how can I (and should I, even) share the training data between the two locations? Ideally I'd like to maintain just one database, since this will all be a single set of mail; otherwise I'd have to repeat the same training at home and at work.

Firstly, I agree with Jesse that it's really not worth sharing the databases - just train each one separately and you should have good results.

> 1. Manually copy the database file(s) from one location to the other. There are several variations on this. I could just do the initial training in one location and copy to the other, then maintain each database separately thereafter, expecting the follow-on training to take much less work.

Note that we don't recommend doing any initial training. We recommend training only on mistakes and unsures.

> 2. Maintain the database file(s) on a server somewhere.

If you do want to do this, then a good choice would be to use ZEO, for which there is already a (relatively untested) storage system written for SpamBayes. There are also MySQL and PostgreSQL options.

> 3. Carry a little USB drive around with me, and keep the DB on that. (Does the DB get too big for this to be practical?)

Depends how big the USB drive is <wink>. This should work fine.

> 4. Super Crazy Ninja Trick?: enhance the SpamBayes IMAP proxy with the ability to maintain a DB in a folder on the IMAP server, download it before beginning filtering, and upload it whenever it is modified. If this seems productive, and the feature doesn't yet exist, I'd be happy to add it if I can find the time.

Storing the database remotely in some other way (e.g. ZEO) would be a more sensible method, IMO.

=Tony.Meyer
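The advice above to train only on mistakes and unsures can be illustrated with a small sketch. This is not SpamBayes code: `classify` and `should_train` are hypothetical helpers, the cutoffs are the usual SpamBayes defaults (0.20 and 0.90), and the handling of scores exactly at a cutoff is an assumption.

```python
HAM_CUTOFF = 0.20   # default ham cutoff; scores below this count as ham
SPAM_CUTOFF = 0.90  # default spam cutoff; scores above this count as spam


def classify(score: float) -> str:
    """Map a spam probability to 'ham', 'spam' or 'unsure'."""
    if score < HAM_CUTOFF:
        return "ham"
    if score > SPAM_CUTOFF:  # boundary handling here is an assumption
        return "spam"
    return "unsure"


def should_train(score: float, actual: str) -> bool:
    """Train only when the filter was unsure or got the message wrong."""
    verdict = classify(score)
    return verdict == "unsure" or verdict != actual


# A confidently correct ham is skipped; an unsure or misfiled message is trained.
for score, actual in [(0.05, "ham"), (0.50, "ham"), (0.05, "spam")]:
    print(score, actual, "train" if should_train(score, actual) else "skip")
```

Under this policy the database at each location only grows when the filter needs correcting, which is part of why maintaining two separate databases tends to take little effort.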
Deneb Meketa wrote:
> My question is: how can I (and should I, even) share the training data between the two locations? Ideally I'd like to maintain just one database, since this will all be a single set of mail; otherwise I'd have to repeat the same training at home and at work.
What did you eventually do? I have a very similar problem and I'd be interested to hear what worked for you.

Rowan