Pursuant to my volunteering to implement Guido's plan to combine cmp.py, cmpcache.py, dircmp.py and dircache.py into filecmp.py, I did some investigating of dircache.py.

I find it completely unreliable. On my NT box, the mtime of the directory is updated (on average) 2 secs after a file is added, but within 10 tries, there's always one in which it takes more than 100 secs (and my test script quits). My Linux box hardly ever detects a change within 100 secs.

I've tried a number of ways of testing this ("this" being checking for a change in the mtime of the directory), the latest of which is below. Even if dircache can be made to work reliably and surprise-free on some platforms, I doubt it can be done cross-platform. So I'd recommend that it just get dropped.

Comments?

---------------------------------------------------
import os
import sys
import time

d = os.getcwd()
atimes = []

def test():
    m = os.stat(d)[8]
    for i in range(10):
        fnm = 's%d.tmp' % i
        open(fnm,'w').write('dummy - delete me')
        for j in range(10000):
            newm = os.stat(d)[8]
            if newm != m:
                atimes.append(j*0.01)
                m = newm
                break
            time.sleep(0.01)
        else:
            print "At round %d, failed to detect add within %3.2f secs" % (i, j*0.01)
            break

def report():
    import operator
    if atimes:
        print "detect adds: min= %3.2f max= %3.2f avg= %3.2f" % (min(atimes), max(atimes), reduce(operator.add, atimes, 0.0)/len(atimes))
    else:
        print "no successfully detected adds"

test()
report()

- Gordon
Gordon McMillan wrote:
Pursuant to my volunteering to implement Guido's plan to combine cmp.py, cmpcache.py, dircmp.py and dircache.py into filecmp.py, I did some investigating of dircache.py.
I find it completely unreliable. On my NT box, the mtime of the directory is updated (on average) 2 secs after a file is added, but within 10 tries, there's always one in which it takes more than 100 secs (and my test script quits). My Linux box hardly ever detects a change within 100 secs.
I've tried a number of ways of testing this ("this" being checking for a change in the mtime of the directory), the latest of which is below. Even if dircache can be made to work reliably and surprise-free on some platforms, I doubt it can be done cross-platform. So I'd recommend that it just get dropped.
Comments?
Note that you'll have to flush and close the tmp file to actually have it written to the file system. That's why you are not seeing any new mtimes on Linux.

Still, I'd suggest declaring it obsolete. Filesystem access is usually cached by the underlying OS anyway, so adding another layer of caching on top of it does not seem worthwhile (plus, the OS knows better when and what to cache).

Another argument against using stat() time entries for caching purposes is their resolution of 1 second. It makes dircache.py unreliable per se for fast-changing directories. The problem is most probably even worse for NFS, and on Samba-mounted WinXX filesystems the mtime trick doesn't work at all (stat() returns the creation time for atime, mtime and ctime).

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 60 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
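As a rough illustration of the flush-and-close point made in the reply, here is a minimal sketch in modern Python 3 (not the Python of the original script). The helper name `dir_mtime_around_add` and the file name `dummy.tmp` are made up for this example, and resetting the directory's mtime with `os.utime` is an assumption added only so that the 1-second resolution cannot mask the change. Whether the directory mtime actually moves still depends on the platform and filesystem, as the thread discusses.

```python
import os
import tempfile
import time


def dir_mtime_around_add(directory):
    """Return (before, after) directory mtimes around adding one file.

    Hypothetical helper for illustration; not part of dircache.py.
    """
    # Assumption: push the directory's mtime 10 seconds into the past so
    # that a coarse timestamp resolution cannot hide the update.
    past = time.time() - 10
    os.utime(directory, (past, past))
    before = os.stat(directory).st_mtime

    path = os.path.join(directory, "dummy.tmp")
    # Leaving the with-block flushes and closes the file, so the write
    # has been handed to the OS before the directory is stat()ed again.
    with open(path, "w") as f:
        f.write("dummy - delete me")

    after = os.stat(directory).st_mtime
    return before, after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        before, after = dir_mtime_around_add(d)
        print("directory mtime changed:", after != before)
```

The sketch only checks a single add in a scratch directory; it says nothing about NFS or Samba mounts, where the reply expects the mtime approach to misbehave.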