[Spambayes] locking pickle/dbm against concurrent access?

Sjoerd Mullender sjoerd at acm.org
Mon Jan 20 16:23:49 EST 2003


On Mon, Jan 20 2003 Skip Montanaro wrote:

> 
> Depending on how training and classifying are accomplished, it's quite
> possible that the two activities will be done in different processes.  For
> example, I am currently experimenting with training using pop3proxy (well,
> still my offshoot proxytrainer at the moment) while classification is being
> done by hammiefilter run from procmail.  This implies a need to lock the
> shelve/pickle file used to store the training info.  Seems to me we need to
> (be able to) lock the shelve/pickle file.  The only lock facility which
> seems cross-platform enough for this application is the set of flags used by
> os.open().  To lock the database you'd have to check/create a lock file
> related (namewise) to the actual database file.  Has anyone given this any
> thought?
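
For reference, the os.open() lock-file idea described above might look
roughly like the sketch below. The helper names try_lock and unlock are
made up for illustration and are not part of Spambayes. The sketch
relies on O_CREAT | O_EXCL making file creation fail when the file
already exists; as far as I know, that guarantee has historically been
unreliable on older NFS versions.

```python
import errno
import os


def try_lock(lockfile):
    """Try to create lockfile atomically; return True if we now hold it."""
    try:
        # O_CREAT | O_EXCL: fail with EEXIST if the file already exists.
        fd = os.open(lockfile, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except OSError as exc:
        if exc.errno == errno.EEXIST:
            return False  # another process holds the lock
        raise
    # Record our pid in the lock file to help with debugging stale locks.
    os.write(fd, str(os.getpid()).encode("ascii"))
    os.close(fd)
    return True


def unlock(lockfile):
    """Release the lock by removing the lock file."""
    os.unlink(lockfile)
```

A caller would loop on try_lock() with a short sleep between attempts
and call unlock() once it has finished with the database.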

I use the following code in my programs.  Each program starts by
creating an instance of this class and ends by calling its close
method.

As far as I know, the safest way to do locking when NFS partitions are
involved is to create a hard link to the lock file, so that is the
technique I use.

import os, time
import spambayes.Options
import spambayes.hammie

class error(Exception):
    pass

class HammieFilter(object):
    def __init__(self):
        dbname = spambayes.Options.options.hammiefilter_persistent_storage_file
        dbname = os.path.expanduser(dbname)
        usedb = spambayes.Options.options.hammiefilter_persistent_use_database
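        # Private temporary file, unique to this process, that we try
        # to hard-link to the shared lock file name.
        tmplock = '%s.lock%d' % (dbname, os.getpid())
        self.lockfile = '%s.lock' % dbname
        open(tmplock, 'w').close()
        # Try up to 5 times, 5 seconds apart.  os.link() raises OSError
        # if the lock file already exists.
        for i in range(5):
            if i > 0:
                time.sleep(5)
            try:
                os.link(tmplock, self.lockfile)
            except OSError:
                pass
            else:
                break
        else:
            # Every attempt failed: someone else holds the lock.
            os.unlink(tmplock)
            raise error('Database locked')
        # We hold the lock; the temporary name is no longer needed.
        os.unlink(tmplock)
        self.hammie = spambayes.hammie.open(dbname, usedb, 'c')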

    def train(self, msg, is_spam):
        self.hammie.train(msg, is_spam)

    def untrain(self, msg, is_spam):
        self.hammie.untrain(msg, is_spam)

    def score(self, msg, evidence = False):
        return self.hammie.score(msg, evidence)

    def close(self):
        # Save the database first, then release the lock.
        self.hammie.store()
        os.unlink(self.lockfile)
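
The link technique can also be pulled out of the class. The following
is only a sketch: LockError, acquire_link_lock and release_link_lock
are names invented here, and the retry count and delay are parameters
rather than the fixed values used above.

```python
import os
import time


class LockError(Exception):
    pass


def acquire_link_lock(lockfile, attempts=5, delay=5.0):
    """Take lockfile by hard-linking a private temporary file to it."""
    tmplock = "%s.%d" % (lockfile, os.getpid())
    open(tmplock, "w").close()
    try:
        for i in range(attempts):
            if i > 0:
                time.sleep(delay)
            try:
                # Fails with OSError if lockfile already exists.
                os.link(tmplock, lockfile)
            except OSError:
                continue
            return
        raise LockError("could not acquire %s" % lockfile)
    finally:
        # The temporary name is not needed whether or not we got the lock.
        os.unlink(tmplock)


def release_link_lock(lockfile):
    """Release the lock by removing the lock file."""
    os.unlink(lockfile)
```

Calling acquire_link_lock() and then doing the work inside try/finally
with release_link_lock() in the finally clause removes the lock file
even if the work raises an exception. With the class above, an
exception between creating the instance and calling close leaves the
lock file behind.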


-- Sjoerd Mullender <sjoerd at acm.org>


