[Python-Dev] __file__
Robert Collins
robertc at robertcollins.net
Mon Mar 1 00:56:20 CET 2010
On Mon, 2010-03-01 at 12:35 +1300, Greg Ewing wrote:
>
> Yes, although that would then incur higher stat overheads for
> people distributing .pyc files. There doesn't seem to be a
> way of pleasing everyone.
>
> This is all assuming that the extra stat calls are actually
> a problem. Does anyone have any evidence that they would
> really take significant time compared to loading the module?
> Once you've looked for one file in a given directory, looking
> for another one in the same directory ought to be quite fast,
> since all the relevant directory blocks will be in the
> filesystem cache.
We've done a bunch of testing in bzrlib. Basic things are:
- statting /is/ expensive *if* you don't use the result.
- loading code is the main cost *once* you have a hot disk cache
Specifically, stats for files that are *not present* incur page-in costs
for the dentries needed to determine the file is absent. In the special
case of probing for $name.$ext1, ...$ext2, ...$ext3, you generally don't
incur additional page-in costs, because in most file systems the lookups
for the second and third entries hit the same pages as the first.
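To see what such a probe sequence looks like from Python, here is a minimal sketch. The helper name `probe` and the extension list are made up for illustration and are not what the import machinery actually does. It stats $name.$ext for each extension in one directory and records whether the file was there and how long the call took. On a freshly created temporary directory everything is already cached, so this will not reproduce cold-cache numbers; to do that you would have to point it at a real directory after dropping caches.

```python
import os
import tempfile
import time


def probe(directory, name, extensions):
    """Stat directory/name+ext for each extension.

    Returns a list of (extension, present, elapsed_seconds) tuples.
    Hypothetical helper for illustration only.
    """
    results = []
    for ext in extensions:
        path = os.path.join(directory, name + ext)
        start = time.perf_counter()
        try:
            os.stat(path)
            present = True
        except FileNotFoundError:
            # An absent file still costs a directory lookup.
            present = False
        elapsed = time.perf_counter() - start
        results.append((ext, present, elapsed))
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Only foo.py exists; the other two probes are misses.
        open(os.path.join(tmp, "foo.py"), "w").close()
        for ext, present, elapsed in probe(tmp, "foo", [".so", ".py", ".pyc"]):
            print(f"foo{ext}: present={present} {elapsed * 1e6:.1f} usec")
```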
In most file systems stats for files that *are present* also incur a
page-in for the inode of the file. If you then do not read the file,
this is I/O that doesn't really gain anything.
Being able to disable .py file usage completely, so that only foo.pyc
and foo/__init__.pyc are probed for, could make a very noticeable
difference to the cold cache startup time.
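As a rough illustration of how much probing that would save, the sketch below lists the paths that would be tried for one module name with and without source files enabled. The function `candidate_paths`, the `allow_source` switch and the suffix lists are all assumptions made for this example; the real import machinery tries more suffixes, in an order that depends on version and platform.

```python
import os

# Illustrative suffix lists only, not the interpreter's real tables.
SOURCE_SUFFIXES = [".py"]
BYTECODE_SUFFIXES = [".pyc"]


def candidate_paths(directory, name, allow_source=True):
    """Return the paths a (hypothetical) importer would probe for `name`."""
    suffixes = (SOURCE_SUFFIXES if allow_source else []) + BYTECODE_SUFFIXES
    paths = []
    # Plain module files: foo.py, foo.pyc
    for suffix in suffixes:
        paths.append(os.path.join(directory, name + suffix))
    # Package initialisers: foo/__init__.py, foo/__init__.pyc
    for suffix in suffixes:
        paths.append(os.path.join(directory, name, "__init__" + suffix))
    return paths


if __name__ == "__main__":
    for allow_source in (True, False):
        paths = candidate_paths("lib", "foo", allow_source=allow_source)
        print(f"allow_source={allow_source}: {len(paths)} probes")
        for path in paths:
            print("   ", path)
```

With these toy lists the bytecode-only case halves the number of probes per module per directory on the search path.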
# Startup time for bzr (cold cache):
$ drop-caches
$ time bzr --no-plugins revno
5061
real 0m8.875s
user 0m0.210s
sys 0m0.140s
# Hot cache
$ time bzr --no-plugins revno
5061
real 0m0.307s
user 0m0.250s
sys 0m0.040s
(revno is a small command that reads a small amount of data - just
enough to trigger demand loading of the core repository layers and so
on).
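One way to read the two `time` results is to subtract user+sys from real, which gives the time the process spent off the CPU. The sketch below does that arithmetic on the figures quoted above. The helper names are made up for this example, and treating all of the off-CPU time as I/O wait is an assumption, though a plausible one for a single-threaded command on an otherwise idle machine.

```python
import re


def parse_time(value):
    """Convert a `time`-style duration such as '0m8.875s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def off_cpu(real, user, sys):
    """Wall-clock time not accounted for by user or system CPU time."""
    return parse_time(real) - (parse_time(user) + parse_time(sys))


if __name__ == "__main__":
    cold = off_cpu("0m8.875s", "0m0.210s", "0m0.140s")
    hot = off_cpu("0m0.307s", "0m0.250s", "0m0.040s")
    print(f"cold cache: {cold:.3f}s off-CPU")
    print(f"hot cache:  {hot:.3f}s off-CPU")
```

That works out to about 8.5 seconds off the CPU with a cold cache against about 0.02 seconds with a hot one, while the CPU time is nearly the same in both runs.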
strace timings for those two operations:
cold cache:
$ strace -c bzr --no-plugins revno
5061
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 56.34    0.040000          76       527           read
 28.98    0.020573           9      2273      1905 open
 14.43    0.010248          14       734       625 stat
  0.15    0.000107           0       533           fstat
...
hot cache:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 45.10    0.000368          92         4           getdents
 19.49    0.000159           0       527           read
 16.91    0.000138           1       163           munmap
 10.05    0.000082           2        54           mprotect
  8.46    0.000069           0      2273      1905 open
  0.00    0.000000           0         8           write
  0.00    0.000000           0       367           close
  0.00    0.000000           0       734       625 stat
...
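The errors column is worth a second look: most of the open and stat calls are misses. The sketch below pulls the call and error counts out of `strace -c` style rows and computes the failure ratio. The parser `parse_summary` is a hypothetical helper written for this example and assumes the six-column layout shown above, with the errors column blank when there were none. The sample rows are copied from the cold cache table.

```python
SAMPLE = """\
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
56.34 0.040000 76 527 read
28.98 0.020573 9 2273 1905 open
14.43 0.010248 14 734 625 stat
0.15 0.000107 0 533 fstat
"""


def parse_summary(text):
    """Map syscall name -> (calls, errors) from `strace -c` style rows."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 5 fields (no errors) or 6 fields (with errors).
        if len(fields) not in (5, 6):
            continue
        try:
            calls = int(fields[3])
            errors = int(fields[4]) if len(fields) == 6 else 0
        except ValueError:
            # Header or separator line.
            continue
        rows[fields[-1]] = (calls, errors)
    return rows


def failure_ratio(rows, syscall):
    """Fraction of calls to `syscall` that returned an error."""
    calls, errors = rows[syscall]
    return errors / calls if calls else 0.0


if __name__ == "__main__":
    rows = parse_summary(SAMPLE)
    for name in ("open", "stat"):
        calls, errors = rows[name]
        print(f"{name}: {errors}/{calls} failed ({failure_ratio(rows, name):.0%})")
```

On these figures roughly 84% of the opens and 85% of the stats fail, that is, they are probes for files that are not there.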
Cheers,
Rob