On Tue, 9 Oct 2018 14:02:02 -0700 Gregory Szorc email@example.com wrote:
Python 3.7 doesn't exhibit as much of a problem. But it is still there. A brief audit of the importer code and call stacks confirms it is the same problem - just less prevalent. Wall-clock time for the test harness drops from ~37:43 on Python 2.7 to ~20:39 on Python 3.7. Overall kernel CPU time drops from ~75% to ~19%. And that wall time improvement is despite Python 3's slower process startup. So locking in the kernel is really a killer on Python 2.7.
Thanks for the detailed feedback.
I hope someone finds this information useful for further improving [startup] performance. (And given that Python 3.7 is substantially faster by avoiding excessive readdir(), I wouldn't be surprised if this problem is already known!)
The macOS problem wasn't known, but the general problem of filesystem calls was (in relation to e.g. networked filesystems).
Significant work went into improving Python 3 in that regard after the import mechanism was rewritten in pure Python. Nowadays Python caches the contents of all sys.path directories, so (once the cache is primed) it's mostly a single stat() call per directory to check whether the cache is up-to-date. This is not entirely free, but massively better than what Python 2 did, which was to stat() many filename patterns in each sys.path directory.
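To make the caching idea concrete, here is a minimal sketch of the strategy described above. It is a toy model, not the actual importlib code: the class name CachedDirFinder, its attributes, and the demo() helper are all made up for illustration, and it only looks for ".py" and ".pyc" files, whereas the real finder also handles packages, extension modules and more.

```python
import os
import tempfile


class CachedDirFinder:
    """Toy model of a directory-listing cache keyed on the directory's mtime.

    Hypothetical class for illustration only; not importlib's implementation.
    """

    def __init__(self, path):
        self.path = path
        self._mtime = None      # directory mtime seen when the listing was cached
        self._entries = set()   # cached directory contents
        self.listdir_calls = 0  # instrumentation for the demo

    def has_module(self, name):
        # One stat() per lookup to check whether the cache is still fresh.
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            # First use, or the directory changed: re-read it once.
            self._entries = set(os.listdir(self.path))
            self._mtime = mtime
            self.listdir_calls += 1
        # Answer from the cached listing instead of probing each candidate
        # filename on disk.
        return any(name + suffix in self._entries for suffix in (".py", ".pyc"))


def demo():
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "present.py"), "w").close()
        finder = CachedDirFinder(d)
        found = [finder.has_module(n) for n in ("present", "missing", "absent")]
        return found, finder.listdir_calls
```

Calling demo() performs three lookups against an unchanged directory but reads the directory only once; the two misses cost a single stat() each. For the real machinery, importlib.invalidate_caches() is the documented way to make the finders forget their cached listings, e.g. after creating modules while the program is running.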
(Of course, the fact that Python 3 imports many more modules at startup offsets some of that gain.)
As a side note, I was always shocked by how the Mercurial test suite is architected. You're wasting so much time launching processes that I wonder why you kept it that way for so long :-)