[ python-Bugs-1607951 ] mailbox.Maildir re-reads directory too often
SourceForge.net
noreply at sourceforge.net
Thu Dec 14 20:09:32 CET 2006
Bugs item #1607951, was opened at 2006-12-03 12:28
Message generated for change (Comment added) made by akuchling
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1607951&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Matthias Klose (doko)
Assigned to: A.M. Kuchling (akuchling)
Summary: mailbox.Maildir re-reads directory too often
Initial Comment:
[forwarded from http://bugs.debian.org/401395]
Various functions in mailbox.Maildir call self._refresh, which always re-reads the cur and new directories with os.listdir. _refresh should stat each of the two directories first to see whether they have changed. Skipping the re-read when nothing has changed would cut the processing time of a series of lookups by a factor of the number of messages in the folder, which can be large.
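
A minimal sketch of the stat-before-listdir idea from the report, assuming a plain mtime comparison. DirSnapshot and relist_count are hypothetical names used only for illustration; this is not the code in mailbox.Maildir or in the attached patch.

```python
import os


class DirSnapshot:
    """Hypothetical helper: cache os.listdir() and relist only when the
    directory's mtime differs from the one seen at the last listing."""

    def __init__(self, path):
        self.path = path
        self._mtime = None     # mtime recorded at the last listing
        self._entries = []     # cached result of os.listdir()
        self.relist_count = 0  # illustration only: how often we relisted

    def entries(self):
        # One stat() call is cheap compared with listing a large directory.
        mtime = os.stat(self.path).st_mtime
        if mtime != self._mtime:
            self._entries = os.listdir(self.path)
            self._mtime = mtime
            self.relist_count += 1
        return self._entries
```

The stat() is done before the listdir() so that a change arriving between the two calls leaves an older mtime in the cache and forces a relist on the next call.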
----------------------------------------------------------------------
>Comment By: A.M. Kuchling (akuchling)
Date: 2006-12-14 14:09
Message:
Logged In: YES
user_id=11375
Originator: NO
Stray thought: would it help if the patch stored the (mtime - 1sec)
instead of the mtime? Successive calls in the same second would then
always re-read the directories, but once the clock ticks to the next
second, re-reads would only occur if the directories have actually changed.
The check would be 'if new_mtime > self._new_mtime' instead of '=='.
Is this sort of mtime-based checking reliable on remote filesystems such
as NFS?
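
One possible reading of the behaviour described above, as a hedged sketch. It records the wall-clock time of the last read in addition to the cached mtime, which is an assumption beyond what the comment proposes: comparing against a stored (mtime - 1sec) with '>' would, read literally, appear to keep matching on an unchanged directory. needs_reread and its parameters are hypothetical names. The sketch also assumes the local clock and the filesystem's clock agree, which may not hold on NFS.

```python
def needs_reread(dir_mtime, cached_mtime, last_read, resolution=1.0):
    """Decide whether a directory should be listed again.

    dir_mtime    -- current mtime of the directory
    cached_mtime -- mtime recorded at the previous listing (None if never read)
    last_read    -- wall-clock time of the previous listing (None if never read)
    resolution   -- assumed mtime granularity of the filesystem, in seconds
    """
    if cached_mtime is None or last_read is None:
        return True  # nothing cached yet
    if dir_mtime > cached_mtime:
        return True  # the directory has visibly changed
    # The previous listing happened within one resolution step of the
    # directory's mtime, so a later change in that same step would not
    # have moved the mtime. Re-read to be safe.
    return last_read - dir_mtime < resolution
```

With these assumptions, calls made within the same second as the last modification always re-read, and a call made after that second has passed re-reads only when the mtime has moved forward.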
----------------------------------------------------------------------
Comment By: A.M. Kuchling (akuchling)
Date: 2006-12-13 08:38
Message:
Logged In: YES
user_id=11375
Originator: NO
By stat()'ing the directories, do you mean checking the mtime? I think
this isn't safe because of the limited resolution of mtime on filesystems;
ext3 seems to have a 1-second resolution for mtime, for example. This
means that _refresh() might read a directory, and if some other process
adds or deletes a message in the same second, _refresh() couldn't detect
the change. Is there some other property of directories that could be used
for a more reliable check?
The attached patch implements checking of mtime, but I don't recommend
applying it; it causes the test suite in test_mailbox.py to break all over
the place, because the process modifies mailboxes so quickly that the mtime
check doesn't notice the process's own changes.
I'll wait a bit for any alternative suggestion, and then close this bug as
"won't fix".
File Added: mailbox-mtime.patch
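
The same-second race described in this comment can be reproduced deterministically. The sketch below is an illustration only: it uses os.utime to pin the directory's mtime to a whole second, simulating a filesystem with 1-second resolution, and demo_missed_update is a hypothetical name.

```python
import os
import tempfile


def demo_missed_update():
    """Return (listing a naive mtime check would serve, actual listing)."""
    with tempfile.TemporaryDirectory() as folder:
        # Pin the mtime to a whole second, as a coarse filesystem would.
        coarse = int(os.stat(folder).st_mtime)
        os.utime(folder, (coarse, coarse))

        # First "refresh": remember the mtime and the listing.
        cached_mtime = os.stat(folder).st_mtime
        cached = os.listdir(folder)

        # A message arrives; simulate it landing in the same second
        # by resetting the directory's mtime to the pinned value.
        open(os.path.join(folder, "msg"), "w").close()
        os.utime(folder, (coarse, coarse))

        # Second "refresh" using an equality check on mtime.
        if os.stat(folder).st_mtime == cached_mtime:
            served = cached  # believes nothing changed
        else:
            served = os.listdir(folder)
        return served, os.listdir(folder)
```

Under the simulated resolution, the first element of the result is the stale empty listing while the second contains the new message, which is the failure mode the test suite runs into.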
----------------------------------------------------------------------