[Python-Dev] Python startup optimization: script vs. service

Neil Schemenauer nas at arctrix.com
Tue Oct 17 13:39:36 EDT 2017

Christian Heimes <christian at python.org> wrote:
> That approach could work, but I think that it is the wrong
> approach. I'd rather keep Python optimized for long-running
> processes and introduce a new mode / option to optimize for
> short-running scripts.

Another idea is to run a fake transaction through the process
before forking.  That will "warm up" things so that most of the lazy
init is already done.

After returning from the core sprint, I have gotten over my initial
enthusiasm for my "lazy module defs" idea.  It is just too big of a
change for Python to accept at this point.  I still hope there
would be a way to make LOAD_NAME/LOAD_GLOBAL trigger something like
__getattr__().  That would allow libraries that want to aggressively
do lazy-init to do so in a clean way.
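
For comparison, here is a rough sketch of what is already possible
for attribute access from outside a module, using a ModuleType
subclass.  LazyModule and build_table() are hypothetical names.  Code
running inside the module that refers to TABLE as a global would go
straight to the module dict and bypass this hook, which is the gap
described above:

```python
import types


class LazyModule(types.ModuleType):
    """Module whose missing attributes are computed on first access."""

    def __init__(self, name, factories):
        super().__init__(name)
        self._factories = factories

    def __getattr__(self, attr):
        # Only called when the normal lookup fails.
        factories = self.__dict__.get("_factories", {})
        if attr not in factories:
            raise AttributeError(attr)
        value = factories[attr]()
        setattr(self, attr, value)  # cache, so the hook runs once
        return value


calls = []


def build_table():
    calls.append(1)  # record that the expensive init ran
    return {n: n * n for n in range(100)}


mod = LazyModule("demo", {"TABLE": build_table})
```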

The main reason that Python startup is slow is that we do far too
much work on module import (e.g. initializing data structures that
never get used).  Reducing that work will almost necessarily impact
pre-fork model programs (e.g. they expect the init to be done before
the fork).
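
To make the trade-off concrete, a small sketch of the same regex
set up eagerly and lazily.  EAGER_PAT and lazy_pat() are
illustrative names only:

```python
import functools
import re

# Eager: the pattern is compiled when the module is imported, whether
# or not anything ever uses it.
EAGER_PAT = re.compile(r"\d+")


@functools.lru_cache(maxsize=None)
def lazy_pat():
    # Lazy: compiled on the first call, then cached.  A pre-fork
    # server that never calls this before fork() pays the cost once
    # in every child instead of once in the parent.
    return re.compile(r"\d+")
```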

As someone who uses that model heavily, I would still be okay with
the "lazification" as I think there are many more programs that
would be helped vs the ones hurt.  Initializing everything that your
program might possibly need right at startup time doesn't seem
like a goal to strive for.  I can understand if you have a different
opinion though.

A third approach would be to do more init work at compile time.
E.g. for re.compile, if the compiled result could be stored in the
.pyc, that would eliminate a lot of time for short scripts and for
long-running programs.  Some Lisp systems have "compiler macros".
They are basically a hook to allow programs to do some work before
the code is sent to the compiler.  If something like that existed in
Python, it could be used by re.compile to generate a compiled
representation of the regex to store in the .pyc file.  That kind of
behavior is pretty different than the "there is only runtime" model
that Python generally tries to follow.

Spit-ball idea, thought up just now:

    PAT = __compiled__(re.compile(...))

The expression in __compiled__(...) would be evaluated by the
compiler and the resulting value would become the value to store in
the .pyc.  If you are running the code as a script, __compiled__
just returns its argument unchanged.


