Re: [Python-Dev] A fast startup patch (was: Python startup time)

4 May 2018

On 4 May 2018 at 06:13, Carl Shapiro wrote:
...
Hello,

Yesterday Neil Schemenauer mentioned some work that a colleague of mine
(CCed) and I have done to improve CPython start-up time.  Given the recent
discussion, it seems timely to discuss what we are doing and whether it is
of interest to other people hacking on the CPython runtime.

There are many ways to reduce the start-up time overhead.  For this
experiment, we are specifically targeting the cost of unmarshaling heap
objects from compiled Python bytecode.  Our measurements show this specific
cost to represent 10% to 25% of the start-up time among the applications we
have examined.

Our approach to eliminating this overhead is to store unmarshaled objects
into the data segment of the python executable.  We do this by processing
the compiled python bytecode for a module, creating native object code with
the unmarshaled objects in their in-memory representation, and linking this
into the python executable.

When a module is imported, we simply return a pointer to the top-level
code object in the data segment directly without invoking the unmarshaling
code or touching the file system.  What we are doing is conceptually
similar to the existing capability to freeze a module, but we avoid
non-trivial unmarshaling costs.

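
For readers who want a rough feel for the unmarshaling cost described above, the following is a minimal sketch, not the measurement method used for the patch. It compiles one pure-Python module, marshals its top-level code object, and times marshal.loads on the result. The helper name unmarshal_cost and the choice of argparse as the sample module are assumptions made for illustration only.

```python
import importlib.util
import marshal
import timeit


def unmarshal_cost(module_name, repeats=200):
    """Return (payload size in bytes, average seconds per marshal.loads).

    Hypothetical helper: it times only the unmarshaling step, which is
    the step the approach above avoids by keeping the objects in the
    executable's data segment.
    """
    # Locate the module's source file without importing the module.
    spec = importlib.util.find_spec(module_name)
    with open(spec.origin, "rb") as source_file:
        source = source_file.read()

    # Build the top-level code object and serialize it the way a .pyc
    # body is serialized (the .pyc header is not included here).
    code = compile(source, spec.origin, "exec")
    payload = marshal.dumps(code)

    # Time how long it takes to rebuild the heap objects from bytes.
    seconds = timeit.timeit(lambda: marshal.loads(payload), number=repeats)
    return len(payload), seconds / repeats


if __name__ == "__main__":
    size, per_load = unmarshal_cost("argparse")
    print(f"argparse: {size} bytes, {per_load * 1e6:.1f} us per unmarshal")
```

The figures this prints depend on the machine and the Python version, and they say nothing about the file-system cost of locating and reading the .pyc file, which the approach above also avoids.
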
This definitely seems interesting, but is it something you'd expect
conventional Python installations to be able to take advantage of, or
is it more something you'd expect to be useful for purpose-built
interpreter instances? (e.g. if Mercurial were running their own Python,
they could precache the heap objects for their commonly imported modules in
their custom interpreter binary, regardless of whether those were standard
library modules or not).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia