[Python-Dev] Python startup time
solipsis at pitrou.net
Tue Oct 9 17:23:42 EDT 2018
On Tue, 9 Oct 2018 14:02:02 -0700
Gregory Szorc <gregory.szorc at gmail.com> wrote:
> Python 3.7 doesn't exhibit as much of a problem. But it is still there.
> A brief audit of the importer code and call stacks confirms it is the
> same problem - just less prevalent. Wall time execution of the test
> harness from Python 2.7 to Python 3.7 drops from ~37:43s to ~20:39.
> Overall kernel CPU time drops from ~75% to ~19%. And that wall time
> improvement is despite Python 3's slower process startup. So locking in
> the kernel is really a killer on Python 2.7.
Thanks for the detailed feedback.
> I hope someone finds this information useful to further improving
> [startup] performance. (And given that Python 3.7 is substantially
> faster by avoiding excessive readdir(), I wouldn't be surprised if this
> problem is already known!)
The macOS problem wasn't known, but the general problem of filesystem
calls was (in relation to networked filesystems, for example).
Significant work went into improving Python 3 in that regard after the
import mechanism was rewritten in pure Python. Nowadays Python caches
the contents of all sys.path directories, so (once the cache is primed)
it's mostly a single stat() call per directory to check whether the
cache is up-to-date. This is not entirely free, but massively better
than what Python 2 did, which was to stat() many filename patterns in
each sys.path directory.
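
To make the caching idea concrete, here is a rough sketch of the
principle. It is not the actual importlib code; the DirectoryCache class
and its contains() method are names made up for illustration, and the
real finder also deals with suffixes, packages and case-insensitive
filesystems.

```python
import os


class DirectoryCache:
    """Toy cache of one directory's listing, revalidated by its mtime.

    Hypothetical helper for illustration only; not part of importlib.
    """

    def __init__(self, path):
        self.path = path
        self._mtime = None       # directory mtime when the cache was last filled
        self._entries = set()    # names seen in the directory at that time

    def contains(self, name):
        # A single stat() on the directory tells us whether the cached
        # listing may be stale.
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            # Directory changed (or first use): re-read the listing once.
            self._entries = set(os.listdir(self.path))
            self._mtime = mtime
        # Lookups are then answered from memory, without further syscalls.
        return name in self._entries
```

Once the cache is primed, each lookup costs one stat() of the directory
instead of one stat() per candidate filename. The obvious weakness is
that a change landing within the filesystem's mtime granularity can go
unnoticed, which is why importlib.invalidate_caches() exists.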
(Of course, the fact that Python 3 imports many more modules at startup
offsets some of that gain.)
As a sidenote, I was always shocked by how the Mercurial test suite
was architected. You're wasting so much time launching processes that
I wonder why you kept it that way for so long :-)