[Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

Larry Hastings larry at hastings.org
Wed Sep 19 19:54:52 EDT 2018

On 09/19/2018 03:08 PM, Terry Reedy wrote:
> If Python usually used derived stdlib code, but could optionally use 
> the original .py files via a command-line switch, experimenting with 
> changes to .py files would be easier.

When Carl described the patch to me, he said there was already a switch 
in there somewhere to do exactly that.  I don't remember if it was 
command-line, it might have been an environment variable.  (I admit I 
didn't go hunting for it--I didn't need it to test the patch itself, and 
I had enough to do.)  Regardless, we would definitely have that 
functionality in before the patch would ever be considered for merging.

We talked about it last week at the core dev sprint, and I thought about 
it some more.  As a result here's the behavior I propose.  I'm going to 
call the process "freezing" and the result "frozen modules", even though 
that's an already-well-overused name and I hope we'll pick something 
else before it gets merged.

    First, .py files that get frozen will have their date/time stamps
    set to a known value, both as part of the tarball / zip file, and
    when installed (a la "make install", the Win32 installer, etc). 
    There are some constraints on this; we distribute Python via .zip
    files, and .zip only supports 2 second resolution for date/time
    stamps.  So maybe something like this: the date is the approximate
    date of the release, and the time is the version number (e.g.
    03:08:00 for all 3.8.x releases).

    When attempting to load a frozen Python module, Python will stat the
    .py file.  If the date/time and size match what we expected, Python
    will use the frozen module.  Otherwise it'll fall back to
    conventional behavior, including supporting .pyc files.

    There will also be a switch (command-line? environment variable?
    compile-time flag? all three?) for people who control their
    environments where you can skip the .py file and use the frozen
    module every time.

In short: correctness by default, and more speed available if you know 
it's safe for your use case.  Use of the optimization is intentionally a 
little fragile, to ensure correctness.



p.s. Why not 03:08:01 for 3.8.1?  That wouldn't be stored properly in 
the .zip file with its only-two-second resolution.  And multiplying the 
tertiary version number by 2--or 10, etc--would be surprising.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20180919/5f82645d/attachment.html>

More information about the Python-Dev mailing list