[Spambayes] Sharing database among multiple IMAP clients?
jsp at PKC.com
Thu Dec 8 14:19:12 CET 2005
I can't comment on the feasibility of replicating the database in
various ways. Instead, I'll make my standard claim: SpamBayes learns so
fast that it's just not worth worrying about sharing the database or
even backing it up. In fact, I throw my database away occasionally if
filtering seems to be getting worse. I generally find that within a day
or two I'm happier with the results, though my assessment is completely
subjective. (My databases tend to become unbalanced over time, with spam
outweighing ham by increasing margins. My hypothesis is that this is
because the content of the spam I receive evolves, so more of it is
classified as unsure and needs to be trained. I train on errors.)
I'm not advocating this particular regimen, just making the point that
SpamBayes training is easy and fast. By contrast, figuring out a way to
share the database is likely to be somewhat difficult and error-prone.
I wouldn't do it myself, but if this is the sort of thing that gets your
motor going, by all means indulge yourself!
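For anyone who does want to experiment, the "download it when I begin,
upload it when I'm done" idea from option 2 of the quoted message can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions: the database is a single file, the shared location is
reachable as an ordinary path (a mapped network drive or a USB stick, say),
and the two sites are never used at the same time. The names copy_if_newer
and synced_db are hypothetical and are not part of SpamBayes.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


def copy_if_newer(src, dst):
    """Copy src over dst when dst is missing or older; return True if copied."""
    if not os.path.exists(src):
        return False
    if os.path.exists(dst) and os.path.getmtime(dst) >= os.path.getmtime(src):
        return False
    # Copy to a temporary name in the destination directory, then rename,
    # so an interrupted transfer never leaves a half-written database.
    dst_dir = os.path.dirname(os.path.abspath(dst))
    fd, tmp = tempfile.mkstemp(dir=dst_dir)
    os.close(fd)
    try:
        shutil.copy2(src, tmp)  # copy2 also copies the modification time
        os.replace(tmp, dst)
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
    return True


@contextmanager
def synced_db(shared_path, local_path):
    """Pull the shared copy before use, push the local copy back afterwards."""
    copy_if_newer(shared_path, local_path)
    try:
        yield local_path
    finally:
        # Only pushes if the local file was modified during the session.
        copy_if_newer(local_path, shared_path)
```

A wrapper script would run the mail session inside
"with synced_db(shared, local):", where shared and local are whatever paths
apply on that machine. If both sites train between syncs, the newer file
simply wins and the other site's training is lost, which is one example of
the error-prone part mentioned above.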
> -----Original Message-----
> From: spambayes-bounces at python.org
> [mailto:spambayes-bounces at python.org] On Behalf Of Deneb Meketa
> Sent: Thursday, December 08, 2005 3:38 AM
> To: spambayes at python.org
> Subject: [Spambayes] Sharing database among multiple IMAP clients?
> Hello! I'm completely new to SpamBayes, and I'm trying to figure out
> how I want to set it up for myself. Thanks to the list in advance.
> I'll probably be using an IMAP server and accessing my mail from two
> main places - work and home.
> My question is: how can I (and should I, even) share the training data
> between the two locations? Ideally I'd like to maintain just one
> database, since this will all be a single set of mail; otherwise I'd
> have to repeat the same training at home and at work.
> Several possibilities occur to me, and I'm curious to hear other
> users' suggestions on these:
> 1. Manually copy the database file(s) from one location to the other.
> There are several variations on this. I could just do the initial
> training in one location and copy to the other, then maintain each
> database separately thereafter, expecting the follow-on training to
> take much less work. Or I could copy the DB each time I did some
> training.
> Or, if there's some reliable way to merge the two slightly different
> DBs, I could periodically do that.
> 2. Maintain the database file(s) on a server somewhere. This is
> really more what I want, but it's harder to arrange. The best thing,
> obviously, would be to run SpamBayes on the server, but I'm expecting
> that, for reliability and simplicity, I'll probably be using a shared
> commercial server where I can't install SpamBayes. But even if I'm
> running SpamBayes on the client, I could maintain a single DB in some
> online location. I could stick it on an FTP or rsync server, and have
> a script (or maybe even write a SpamBayes extension) that downloads it
> when I begin using SpamBayes and uploads it when I'm done. I could
> store the DB at work and map a Windows network drive to point at my
> work machine through my VPN, and configure SpamBayes at home to use
> the DB on the network drive - but my VPN isn't completely reliable, so
> that could be a hassle. Or perhaps there are distributed-DB features
> built into some of the storage options that SpamBayes can use?
> 3. Carry a little USB drive around with me, and keep the DB on that.
> (Does the DB get too big for this to be practical?)
> 4. Super Crazy Ninja Trick?: enhance the SpamBayes IMAP proxy with the
> ability to maintain a DB in a folder on the IMAP server, download it
> before beginning filtering, and upload it whenever it is modified. If
> this seems productive, and the feature doesn't yet exist, I'd be happy
> to add it if I can find the time.
> Surely others have run into the same DB-replication question. Is
> there any conventional wisdom on this yet? I didn't find anything
> obviously related in the wiki, the FAQ, or the mail archives. I
> realize the question probably does come down to a simple issue of
> "how do I share a file between multiple sites?", which isn't
> particular to SpamBayes, but maybe SpamBayes users are a good crowd to
> ask such a question of.
> Any help is much appreciated!
> Deneb Meketa,
> San Francisco.