[ mailman-Patches-1045656 ] Race in LockFile.py resulting in extraneous log entries.

SourceForge.net noreply at sourceforge.net
Fri Oct 22 14:06:22 CEST 2004

Patches item #1045656, was opened at 2004-10-12 15:50
Message generated for change (Comment added) made by bwarsaw
You can respond by visiting: 

Category: mail delivery
Group: Mailman 2.1
Status: Open
Resolution: None
Priority: 3
Submitted By: Brian Greenberg (grnbrg)
Assigned to: Tokio Kikuchi (tkikuchi)
Summary: Race in LockFile.py resulting in extraneous log entries.

Initial Comment:
There is a race condition (that does not affect correct
operation of Mailman) in .../Mailman/LockFile.py.  This
race results in entries to the "locks" log file
(usually in pairs) that are unnecessary and confusing.

If there is a process holding a lock for a file, and a
process waiting for that lock to be freed, the
following sequence can occur:

1)  The waiting process executes os.link().  Since the
running process still has the lock, this fails with EEXIST.
2)  The running process releases the lock, and removes
the lock file.
3)  The waiting process proceeds with it's checks, and
checks self.__linkcount() which returns -1 due to an
ENOENT error.  This throws an error into the "locks"
log file.
4)  The waiting process continues processing, checks
it's timeout, and then checks to see if the lock
lifetime has expired.  The lifetime check is based on a
call to self.__releasetime(), which returns -1, due to
an ENOENT error.  This results in the lockfile being
declared stale, and self.__break() is called.  This
then throws a second error into the "locks" log file.

The attached patch checks for these ENOENT errors,
immediately restarting the loop when they are detected.
 It does not affect operation is the lock file exists
and is held by another process.

Brian Greenberg.


>Comment By: Barry A. Warsaw (bwarsaw)
Date: 2004-10-22 08:06

Logged In: YES 

I haven't had time to look at this in detail, but I'll just
include a general warning that LockFile.py is /very/
fragile.  It's also critical to the proper operation of
Mailman.  So any changes, even seemingly innocent onces,
must be very thoroughly tested.  My inclination would be to
reject any change that wasn't absolutely necessary, but I
won't veto this change if it is well-tested and fixes a
verified bug (tho' it sounds like tkikuchi can't reproduce it).


Comment By: Tokio Kikuchi (tkikuchi)
Date: 2004-10-22 04:26

Logged In: YES 

Will you be more specific how this race occurs? My
installation on FreeBSD doesn't looks like show this symptom
in locks log file.
I tested with two terminals invoking python -i bin/withlist
testlist and try to lock with m.Lock() in both the terminal.
One of term is forced to wait until the I type m.Unlock() in
the other term. I then looked into logs dir but nothing avail.


Comment By: Brian Greenberg (grnbrg)
Date: 2004-10-15 10:22

Logged In: YES 

This would fix this bug, as long as CLOCK_SLOP is large enough.

But I don't think it's the Right Thing, as you are returning
a result that says that the lockfile has a lifetime, when it
in fact simply no longer exists.  For a function whose
purpose is to report the mtime of the master lockfile, -1 is
the correct response if the file doesn't exist.  The code
that calls that function should check for and interpret that


Comment By: SHIGENO Kazutaka (shigeno)
Date: 2004-10-14 20:11

Logged In: YES 

How about changing as follows about self.__releasetime()?

@@ -448,7 +448,7 @@
             return os.stat(self.__lockfile)[ST_MTIME]
         except OSError, e:
             if e.errno <> errno.ENOENT: raise
-            return -1
+            return time.time()

     def __linkcount(self):


You can respond by visiting: 

More information about the Mailman-coders mailing list