
On Mon, 2010-03-01 at 12:35 +1300, Greg Ewing wrote:
Yes, although that would then incur higher stat overheads for people distributing .pyc files. There doesn't seem to be a way of pleasing everyone.
This is all assuming that the extra stat calls are actually a problem. Does anyone have any evidence that they would really take significant time compared to loading the module? Once you've looked for one file in a given directory, looking for another one in the same directory ought to be quite fast, since all the relevant directory blocks will be in the filesystem cache.
We've done a bunch of testing in bzrlib. Basic things are:
 - statting /is/ expensive *if* you don't use the result.
 - loading code is the main cost *once* you have a hot disk cache

Specifically, stats for files that are *not present* incur page-in costs for the dentries needed to determine the file is absent. In the special case of probing for $name.$ext1, ...$ext2, ...$ext3, you generally hit the same pages and don't incur additional page-in costs. (You'll hit the same page in most file systems when you look for the second and third entries.)

In most file systems stats for files that *are present* also incur a page-in for the inode of the file. If you then do not read the file, this is I/O that doesn't really gain anything.

Being able to disable .py file usage completely - so that only foo.pyc and foo/__init__.pyc are probed for - could have a very noticeable effect on the cold cache startup time.

# Startup time for bzr (cold cache):
$ drop-caches
$ time bzr --no-plugins revno
5061

real    0m8.875s
user    0m0.210s
sys     0m0.140s

# Hot cache
$ time bzr --no-plugins revno
5061

real    0m0.307s
user    0m0.250s
sys     0m0.040s

(revno is a small command that reads a small amount of data - just enough to trigger demand loading of the core repository layers and so on).

strace timings for those two operations:

cold cache:
$ strace -c bzr --no-plugins revno
5061
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 56.34    0.040000          76       527           read
 28.98    0.020573           9      2273      1905 open
 14.43    0.010248          14       734       625 stat
  0.15    0.000107           0       533           fstat
...

hot cache:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 45.10    0.000368          92         4           getdents
 19.49    0.000159           0       527           read
 16.91    0.000138           1       163           munmap
 10.05    0.000082           2        54           mprotect
  8.46    0.000069           0      2273      1905 open
  0.00    0.000000           0         8           write
  0.00    0.000000           0       367           close
  0.00    0.000000           0       734       625 stat
...

Cheers,
Rob
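
For anyone who wants to see the probing pattern in isolation, here is a rough sketch, not the actual import machinery. The helper name probe() and the suffix order in SUFFIXES are made up for illustration; the real order and set of extensions may differ. It stats each candidate in turn and counts the failed lookups, which are the calls that cost a dentry page-in on a cold cache.

```python
import os
import tempfile
import time

# Hypothetical suffix order, for illustration only; the real import
# machinery may probe a different set in a different order.
SUFFIXES = (".so", "module.so", ".py", ".pyc")


def probe(directory, name, suffixes=SUFFIXES):
    """Stat each candidate file in turn.

    Returns (first existing path or None, number of failed stats,
    elapsed seconds).
    """
    misses = 0
    start = time.perf_counter()
    for suffix in suffixes:
        path = os.path.join(directory, name + suffix)
        try:
            os.stat(path)
        except OSError:
            # File absent: the kernel still had to consult the directory.
            misses += 1
            continue
        return path, misses, time.perf_counter() - start
    return None, misses, time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Simulate a .pyc-only distribution.
        open(os.path.join(d, "foo.pyc"), "wb").close()
        print(probe(d, "foo"))                     # full probe sequence
        print(probe(d, "foo", suffixes=(".pyc",)))  # .pyc-only probing
```

Run against a freshly created temporary directory, this only ever exercises the hot cache case, so the elapsed times will be tiny. The miss count is the interesting part: it is the number of stats that would each risk a page-in when the cache is cold.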