[Python-Dev] startup time repeated? why not daemon

Thu Jul 20 17:16:10 EDT 2017

On Thu, Jul 20, 2017 at 11:53 AM, Jim J. Jewett <jimjjewett at gmail.com> wrote:
> I agree that startup time is a problem, but I wonder if some of the pain
> could be mitigated by using a persistent process.
>
> [snip]
>
> Is it too hard to create a daemon server?
> Is the communication and context switch slower than a new startup?
> Is the pattern just not well-enough advertised?

A couple of years ago I suggested the same idea (i.e. "pythond") during a
conversation with MAL at PyCon UK.  IIRC, security and complexity were
the two major obstacles.  Assuming you use fork, you must ensure that
the daemon gets into just the right state.  Otherwise you're leaking
(potentially sensitive) info into the forked processes or you're
wasting cycles/memory.  Relatedly, at PyCon this year Barry and I were
talking about the idea of bootstrapping the interpreter from a memory
snapshot on disk, rather than from scratch (thus drastically reducing
the number of IO events).  From what I gather, Emacs does (or did)
something like this.

The key thing for both solutions is getting the CPython runtime in a
very specific state.  Any such solution needs to get as much of the
runtime ready as possible, but only as much as is common to "most"
possible "python" invocations.  Furthermore, it has to be extremely
careful about security, e.g. protecting sensitive data and not
escalating privileges.  Having a python daemon that runs as root is
probably out of the question for now, meaning each user would have to
run their own daemon, paying for startup the first time they run
"python".  Aside from security concerns, there are parts of the
CPython runtime that depend on CLI flags and environment variables
during startup.  Each "python" invocation must respect those inputs,
as happens now, rather than preserving the inputs from when the daemon
was started.
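
A rough sketch of that last point, again assuming POSIX fork and using
a hypothetical helper name (run_with_invocation_inputs): the child
replaces the daemon's argv and environment with the ones supplied by
the caller before running anything.  This only covers inputs visible at
the Python level.  Settings consumed during interpreter initialization
(-O or PYTHONHASHSEED, for example) cannot be reapplied this way after
the fact, which is the harder part of the problem.

```python
import json
import os
import sys


def run_with_invocation_inputs(func, argv, environ):
    """Fork, apply the caller's argv/environment in the child, run func.

    The parent's own sys.argv and os.environ are left untouched.
    The result must be JSON-serializable.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            os.close(read_fd)
            # Drop what the daemon was started with and use the
            # per-invocation inputs instead.
            os.environ.clear()
            os.environ.update(environ)
            sys.argv[:] = list(argv)
            payload = json.dumps(func()).encode("utf-8")
            with os.fdopen(write_fd, "wb") as out:
                out.write(payload)
            status = 0
        finally:
            os._exit(status)
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as src:
        data = src.read()
    _, wait_status = os.waitpid(pid, 0)
    if wait_status != 0:
        raise RuntimeError("forked child failed")
    return json.loads(data.decode("utf-8"))
```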

FWIW, the startup-related code we landed in May (at PyCon), as a
precursor for Nick Coghlan's PEP 432, improves the technical situation
somewhat by more clearly organizing startup of CPython's runtime (and
the main interpreter).  Also, as part of my (slowly progressing)
multi-core Python project, I'm currently working on consolidating the
(CPython) global runtime state into an explicit struct.  This will
help us reason better about the state of the runtime, allowing us to
be more confident about (and more able to implement) solutions for
isolating/protecting/optimizing the CPython runtime.  These efforts
have been all about improving the understandability of CPython's
runtime through more concise encapsulation.  The overarching goals
have been: reducing our maintenance burden, lowering the cost of
enhancement, improving the embedding story, and even enabling better
runtime portability (e.g. across threads, processes, and even hosts).
There is a direct correlation there with better opportunities to
improve startup time, including a python daemon.

-eric