[Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?
larry at hastings.org
Wed Sep 19 19:54:52 EDT 2018
On 09/19/2018 03:08 PM, Terry Reedy wrote:
> If Python usually used derived stdlib code, but could optionally use
> the original .py files via a command-line switch, experimenting with
> changes to .py files would be easier.
When Carl described the patch to me, he said there was already a switch
in there somewhere to do exactly that. I don't remember whether it was a
command-line option; it might have been an environment variable. (I admit
I didn't go hunting for it--I didn't need it to test the patch itself, and
I had enough to do.) Regardless, we would definitely have that
functionality in before the patch would ever be considered for merging.
We talked about it last week at the core dev sprint, and I thought about
it some more. As a result, here's the behavior I propose. I'm going to
call the process "freezing" and the result "frozen modules", even though
that's an already overused name and I hope we'll pick something else
before it gets merged.
First, .py files that get frozen will have their date/time stamps
set to a known value, both as part of the tarball / zip file, and
when installed (a la "make install", the Win32 installer, etc).
There are some constraints on this; we distribute Python via .zip
files, and .zip only supports 2 second resolution for date/time
stamps. So maybe something like this: the date is the approximate
date of the release, and the time is the version number (e.g.
03:08:00 for all 3.8.x releases).
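To make the timestamp scheme concrete, here is a minimal sketch of how
such a known value could be computed and applied. The function names, the
use of UTC, and the example release date are all assumptions made for
illustration; none of them come from the patch.

```python
import datetime
import os


def canonical_mtime(release_date, major, minor):
    """Return a POSIX timestamp for the release date at HH:MM:00,
    where HH is the major and MM the minor version number.

    Hypothetical helper; interpreting the time as UTC is an assumption.
    """
    stamp = datetime.datetime(
        release_date.year, release_date.month, release_date.day,
        major, minor, 0, tzinfo=datetime.timezone.utc,
    )
    return int(stamp.timestamp())


def stamp_file(path, mtime):
    """Set both the access and modification time of path to mtime."""
    os.utime(path, (mtime, mtime))


# A made-up release date, used only as an example.
example_mtime = canonical_mtime(datetime.date(2019, 1, 1), 3, 8)
```

Because the seconds field is always zero, the value survives storage in a
.zip file unchanged.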
When attempting to load a frozen Python module, Python will stat the
.py file. If the date/time and size match what we expected, Python
will use the frozen module. Otherwise it'll fall back to
conventional behavior, including supporting .pyc files.
There will also be a switch (command-line? environment variable?
compile-time flag? all three?) for people who control their
environments, which lets Python skip checking the .py file and use
the frozen module every time.
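The loading rules above can be sketched roughly as follows. This is only
an illustration of the proposed decision, not code from the patch: the
record type, the function name, and the environment variable name are
hypothetical, and treating a missing .py file as "fall back" is my own
assumption.

```python
import os
from typing import NamedTuple


class FrozenRecord(NamedTuple):
    """Expected metadata for the .py file a frozen module was built from."""
    mtime: int  # the known date/time stamp, in seconds
    size: int   # expected size of the .py file, in bytes


# Hypothetical name for the "always use the frozen module" switch.
FORCE_FROZEN_ENV = "HYPOTHETICAL_ALWAYS_USE_FROZEN"


def should_use_frozen(py_path, expected, environ=os.environ):
    """Return True if the frozen module should be used for py_path."""
    # Opt-in fast path: skip the stat call entirely.
    if environ.get(FORCE_FROZEN_ENV):
        return True
    try:
        st = os.stat(py_path)
    except OSError:
        # Assumption: if the .py file cannot be stat'ed, fall back.
        return False
    # Both the time stamp and the size must match what was recorded.
    return int(st.st_mtime) == expected.mtime and st.st_size == expected.size
```

A False result means the conventional import path (including .pyc
handling) takes over, so editing a stdlib .py file changes its size or
time stamp and automatically disables the frozen copy.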
In short: correctness by default, and more speed available if you know
it's safe for your use case. Use of the optimization is intentionally a
little fragile, to ensure correctness.
P.S. Why not 03:08:01 for 3.8.1? That wouldn't be stored properly in
the .zip file, with its two-second resolution. And multiplying the
tertiary version number by 2--or 10, etc--would be surprising.
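The two-second limit is easy to see with the standard zipfile module.
The short sketch below writes an empty member with a given time into an
in-memory archive and reads the time back; the helper name, member name,
and date are arbitrary choices for the demonstration.

```python
import io
import zipfile


def roundtrip_time(hour, minute, second):
    """Store an empty member with the given time in an in-memory .zip
    and return the (hour, minute, second) read back from the archive."""
    buf = io.BytesIO()
    info = zipfile.ZipInfo(
        "example.py", date_time=(2019, 1, 1, hour, minute, second)
    )
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(info, b"")
    with zipfile.ZipFile(buf) as zf:
        return zf.getinfo("example.py").date_time[3:]


print(roundtrip_time(3, 8, 0))  # (3, 8, 0)
print(roundtrip_time(3, 8, 1))  # (3, 8, 0): the odd second is lost
```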