[ python-Bugs-1599254 ] mailbox: other programs' messages can vanish without trace

SourceForge.net noreply at sourceforge.net
Fri Dec 15 15:06:32 CET 2006


Bugs item #1599254, was opened at 2006-11-19 11:03
Message generated for change (Comment added) made by akuchling
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1599254&group_id=5470

Category: Python Library
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: David Watson (baikie)
Assigned to: A.M. Kuchling (akuchling)
Summary: mailbox: other programs' messages can vanish without trace

Initial Comment:
The mailbox classes based on _singleFileMailbox (mbox, MMDF, Babyl) implement the flush() method by writing the new mailbox contents into a temporary file, which is then renamed over the original. Unfortunately, if another program tries to deliver messages while mailbox.py is working, and uses only fcntl() locking, it will have the old file open and be blocked waiting for the lock to become available. Once mailbox.py has replaced the old file and closed it, making the lock available, the other program will write its messages into the now-deleted "old" file, consigning them to oblivion.
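
The failure mode can be reproduced without mailbox.py at all. The following is a minimal sketch, assuming a POSIX system; the function name, file names and message text are hypothetical, and the fcntl() wait is only described in comments rather than performed:

```python
import os
import tempfile


def demonstrate_lost_write():
    """Return the mailbox contents after a writer appends via a stale handle."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "mbox")  # hypothetical mailbox path
    with open(path, "w") as f:
        f.write("From a@example.com\n\nold message\n")

    # The delivery agent opens the mailbox here. In the real scenario it
    # would now block in fcntl() until the lock is released.
    stale = open(path, "a")

    # Meanwhile the flush() strategy described above writes a temporary
    # file and renames it over the original, replacing the inode at `path`.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as tmp:
        tmp.write("From a@example.com\n\nold message\n")
    os.rename(tmp_path, path)

    # The lock is now free, so the agent appends, but its handle still
    # refers to the old, unlinked inode.
    stale.write("From b@example.com\n\nnew message\n")
    stale.close()

    with open(path) as f:
        return f.read()
```

Reading the mailbox by path afterwards shows only the old message; the appended one went to a file that no longer has a name.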

I've caused Postfix on Linux to lose mail this way (although I did have to turn off its use of dot-locking to do so).

A possible fix is attached.  Instead of new_file being renamed, its contents are copied back to the original file.  If file.truncate() is available, the mailbox is then truncated to size.  Otherwise, if truncation is required, it's truncated to zero length beforehand by reopening self._path with mode wb+.  In the latter case, there's a check to see if the mailbox was replaced while we weren't looking, but there's still a race condition.  Any alternative ideas?
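
For illustration only (this is not the attached patch), the copy-back idea might look roughly like the sketch below. It assumes file.truncate() is available and that both arguments are already-open binary file objects; copy_back is a hypothetical helper name, and the wb+ fallback described above is not covered:

```python
import os
import shutil


def copy_back(original, new_file):
    """Overwrite `original` in place with the contents of `new_file`.

    `original` is the mailbox file, opened in "rb+" mode (and, in real
    use, already locked). Because it is never renamed or reopened, its
    inode stays the same, so other processes holding it open still see
    the live mailbox.
    """
    new_file.seek(0)
    original.seek(0)
    shutil.copyfileobj(new_file, original)
    # Drop any leftover bytes if the new contents are shorter than the old.
    original.truncate()
    original.flush()
    os.fsync(original.fileno())
```

The sketch deliberately works on the existing file object rather than opening the path again: on POSIX, closing any descriptor for a file releases the process's fcntl locks on it.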

Incidentally, this fixes a problem whereby Postfix wouldn't deliver to the replacement file as it had the execute bit set.


----------------------------------------------------------------------

>Comment By: A.M. Kuchling (akuchling)
Date: 2006-12-15 09:06

Message:
Logged In: YES 
user_id=11375
Originator: NO

I'm testing the fix using two Python processes running mailbox.py, and my
test case fails even with your patch.  This is due to a second bug, which
the patch does not address.

mbox has a dictionary attribute, _toc, mapping message keys to positions
in the file.  flush() writes out all the messages in self._toc and
constructs a new _toc with the new file offsets.  It doesn't re-read the
file to see if new messages were added by another process.

One fix that seems to work: instead of doing 'self._toc = new_toc' after
flush() has done its work, do 'self._toc = None'.  The ToC will be
regenerated the next time _lookup() is called, causing a re-read of all the
contents of the mbox.  Inefficient, but I see no way around the necessity
for doing this.
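
A toy version of the idea, to make the behaviour concrete. This is a sketch under stated assumptions, not the real mailbox.py code: TinyMbox is a hypothetical stand-in class, its ToC maps integer keys to (start, stop) byte offsets, and flush() here does nothing except discard the cached ToC:

```python
class TinyMbox:
    """Toy stand-in for an mbox reader with a lazily built table of contents."""

    def __init__(self, path):
        self._path = path
        self._toc = None  # built on first lookup

    def _generate_toc(self):
        # Scan the whole file; every line starting with "From " opens a message.
        toc = {}
        start = None
        with open(self._path, "rb") as f:
            while True:
                pos = f.tell()
                line = f.readline()
                if not line:
                    if start is not None:
                        toc[len(toc)] = (start, pos)
                    break
                if line.startswith(b"From "):
                    if start is not None:
                        toc[len(toc)] = (start, pos)
                    start = pos
        self._toc = toc

    def _lookup(self):
        # Regenerate the ToC only if it has been discarded (or never built).
        if self._toc is None:
            self._generate_toc()
        return self._toc

    def flush(self):
        # (Writing out pending changes is omitted in this sketch.)
        # Discard the ToC instead of keeping the offsets just written, so
        # the next lookup re-reads the file and sees other writers' messages.
        self._toc = None

    def __len__(self):
        return len(self._lookup())
```

Until flush() is called, len() keeps reporting the cached count even if another writer has appended a message; after flush() the next lookup re-scans the file.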

It's not clear to me that my suggested fix is enough, though.  Process #1
opens a mailbox, reads the ToC, and then does something else for 5
minutes.  In the meantime, process #2 adds a message to the mbox.  Process
#1 then adds a message to the mbox and writes it out; it never notices
process #2's change.

Maybe the _toc has to be regenerated every time you call lock(), because
at this point you know there will be no further updates to the mbox by any
other process.  Any unlocked usage of _toc should also really be
regenerating _toc every time, because you never know if another process has
added a message... but that would be really inefficient.
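
Roughly what "regenerate on lock()" could look like, as a sketch only. It assumes a POSIX system with the fcntl module; LockingMbox and its methods are hypothetical and much simpler than the real classes (the ToC just maps integer keys to start offsets):

```python
import fcntl


class LockingMbox:
    """Toy mailbox that throws away its cached ToC whenever it takes the lock."""

    def __init__(self, path):
        self._file = open(path, "rb+")
        self._toc = None
        self._locked = False

    def lock(self):
        if not self._locked:
            fcntl.lockf(self._file, fcntl.LOCK_EX)
            self._locked = True
            # Anything cached before the lock was held may be stale.
            self._toc = None

    def unlock(self):
        if self._locked:
            fcntl.lockf(self._file, fcntl.LOCK_UN)
            self._locked = False

    def _scan(self):
        # Re-read through the already-open file: opening and closing a
        # second descriptor would release this process's fcntl lock.
        toc = {}
        self._file.seek(0)
        while True:
            pos = self._file.tell()
            line = self._file.readline()
            if not line:
                break
            if line.startswith(b"From "):
                toc[len(toc)] = pos
        return toc

    def keys(self):
        if self._toc is None:
            self._toc = self._scan()
        return sorted(self._toc)

    def close(self):
        self.unlock()
        self._file.close()
```

Unlocked callers of keys() still see whatever was cached, which is the inefficiency trade-off described above: only the locked path pays for a full re-scan.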

----------------------------------------------------------------------

Comment By: A.M. Kuchling (akuchling)
Date: 2006-12-15 08:17

Message:
Logged In: YES 
user_id=11375
Originator: NO

The attached patch adds a test case to test_mailbox.py that demonstrates
the problem.  No modifications to mailbox.py are needed to show data loss.

Now looking at the patch...

File Added: mailbox-test.patch

----------------------------------------------------------------------

Comment By: A.M. Kuchling (akuchling)
Date: 2006-12-12 16:04

Message:
Logged In: YES 
user_id=11375
Originator: NO

I agree with David's analysis; this is in fact a bug.  I'll try to look at
the patch.

----------------------------------------------------------------------

Comment By: David Watson (baikie)
Date: 2006-11-19 15:44

Message:
Logged In: YES 
user_id=1504904
Originator: YES

This is a bug.  The point is that the code is subverting the protection of
its own fcntl locking.  I should have pointed out that Postfix was still
using fcntl locking, and that should have been sufficient.  (In fact, it
was due to its use of fcntl locking that it chose precisely the wrong
moment to deliver mail.)  Dot-locking does protect against this, but not
every program uses it - which is precisely the reason that the code
implements fcntl locking in the first place.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2006-11-19 15:02

Message:
Logged In: YES 
user_id=21627
Originator: NO

Mailbox locking was invented precisely to support this kind of operation.
Why do you complain that things break if you deliberately turn off the
mechanism preventing breakage?

I fail to see a bug here.

----------------------------------------------------------------------
