[Mailman-Developers] race condition in locking ?

Harald Meland Harald.Meland@usit.uio.no
08 Feb 2000 03:58:47 +0100


[Thomas Wouters]

> On Mon, Feb 07, 2000 at 10:01:36PM +0100, Harald Meland wrote:
> 
> > My current CVS checkout is available ("live" :) under <URL:
> > http://www.uio.no/~hmeland/tmp/mailman-userdb/>, with <URL:
> > http://www.uio.no/~hmeland/tmp/mailman-userdb/Mailman/LockFile.py>
> > being of special interest in this discussion...
> 
> Ah, very cool. You'll find my current version of LockFile.py
> attached, for reference, as you asked.

Thanks, I've skimmed through most of it now.

> I think the link() solutions, as opposed to the rename() solution,
> is a better one, if you'll premit me.

Sure :)

> For one, you can check on lock ownership by statting your own
> lockfile, which should only be touched by you, instead of reading a
> possibly highly volatile lockfile. You can also doublecheck the
> lockfile inode, see if it is equal to your own lockfiles' inode, to
> confirm that you have the lock.

... the downside being that if, e.g., anyone tries to read the info in
the lockfile in the middle of a refresh(), they'll get a ValueError --
which your module seems to take care of, but it all looks a bit
convoluted to my eyes.  I especially dislike that lock() has to call
locked() _after_ the check on ST_NLINK, because further down in lock()
there's code that might break a lock it really shouldn't (if it
weren't for the nonatomic-write-of-lock-contents issue).

I also see some bugs that are fixed in my module but still present in
yours, but those are rather irrelevant to this discussion (as they can
easily be ported to your module if that's the one we'll go for).

Does anyone but Thomas and me have any opinions on which locking
scheme is preferrable?  I guess I don't have any very strong feelings
about the issue ;), so any sagely advise on the matter would be
welcome.

[ I've tried to model my version of the module as closely as possible
  after the way Exim does its lockfile locking, the idea being that
  Exim really should have been around long enough to have solved these
  things correctly.  Reading the longish comment from ca. line 1206
  and onwards in src/transports/appendfile.c of the Exim distribution
  proved very helpful to me.

  Of course, when it comes to lock stealing and other ugly (but
  needed, I guess) stuff, Exim couldn't provide much help. ]

> Ah, well, I couldn't find any parts of Mailman in the CVS tree that
> acually _called_ steal, so I couldn't figure out what it should do
> in borderline cases.

The only place I know of is cron/gate_news.

> However, reading your LockFile.py made me realise how to steal the
> lock. LockFile.steal() now does its utmost best to steal the lock,
> only failing if someone else is busy stealing the lock from us. It
> returns 1 for success, 0 for failure, mostly because I wanted it to
> signify that in some way.

... but the caller should note that steal() returning 1 does not
necessarily imply that it is the owner of the lock (anymore).  The
steal() stuff is bound to be full of race conditions, AFAICT -- so
whether it returns anything doesn't really matter (IMHO), as the
return code truly can't be trusted anyway.

Cheers,
-- 
Harald