[Python-Dev] Re: My take on multiple interpreters (Was: Should we be making so many changes in pursuit of PEP 554?)

10 Jun 2020

Hi,

I agree that embedding Python is an important use case and that we
should try to leak less memory and better isolate multiple
interpreters for this use case.

There are multiple projects to enhance code to make it work better
with multiple interpreters:

* convert C extension modules to multiphase initialization (PEP 489)
* move C extension module global variables (static ...) into a module state
* convert static types to heap types
* make free lists per interpreter
* etc.

From what I saw, the first side effect is that "suddenly", tests using
subinterpreters start to report new reference leaks. Examples of
issues and fixes:

* https://github.com/python/cpython/commit/18a90248fdd92b27098cc4db773686a2d10...:
reference leak in the init function of the select module
* https://github.com/python/cpython/commit/310e2d25170a88ef03f6fd31efcc899fe06...:
reference cycles with encodings and _testcapi misuses
PyModule_AddObject()
* https://bugs.python.org/issue40050: _weakref and importlib
* etc.

In fact, none of these bugs is new. I checked a few: the bugs were
always there. It's just that previously, nobody paid attention to
these leaks.

Fixing subinterpreters helps to leak less memory even for the single
interpreter (embedded Python) use case.

The problem is that Python never tried to clear everything at exit.
One way to see the issue is the number of references at exit using a
debug build, on the up-to-date master branch:

$ ./python -X showrefcount -c pass
[18645 refs, 6141 blocks]

Python leaks 18,645 references at exit. Some of the work that I listed
is tracked by https://bugs.python.org/issue1635741 which was created
in 2007: "Py_Finalize() doesn't clear all Python objects at exit".
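
To track that number over time, the output can be parsed by a small
script. This is only a rough sketch: the helper names
(parse_showrefcount, refs_at_exit) are made up for illustration, and it
assumes the "[N refs, M blocks]" format shown above. On a release build
the option prints nothing, so the helper returns None there.

```python
import re
import subprocess
import sys

# Matches the line printed for -X showrefcount, e.g. "[18645 refs, 6141 blocks]".
_REFCOUNT_RE = re.compile(r"\[(\d+) refs, (\d+) blocks\]")


def parse_showrefcount(text):
    """Return (refs, blocks) parsed from the output, or None if absent."""
    match = _REFCOUNT_RE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def refs_at_exit(python=sys.executable):
    """Run `python -X showrefcount -c pass` and parse what it reports.

    Hypothetical helper: returns None when the interpreter prints no
    counts (for example, when it is not a debug build).
    """
    proc = subprocess.run(
        [python, "-X", "showrefcount", "-c", "pass"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Scan both streams rather than assuming which one carries the counts.
    return parse_showrefcount(proc.stderr + proc.stdout)
```

For example, parse_showrefcount("[18645 refs, 6141 blocks]") gives
(18645, 6141), and refs_at_exit("./python") would run the debug build
from the current directory.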

Another way to see the issue is:

$ PYTHONMALLOC=malloc valgrind ./python -c pass
(...)
==169747== LEAK SUMMARY:
==169747==    definitely lost: 48 bytes in 2 blocks
==169747==    indirectly lost: 136 bytes in 6 blocks
==169747==      possibly lost: 700,552 bytes in 5,677 blocks
==169747==    still reachable: 5,450 bytes in 48 blocks
==169747==         suppressed: 0 bytes in 0 blocks

Python leaks around 700 KB at exit.
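
The same kind of bookkeeping can be done for the valgrind summary. The
sketch below assumes the "LEAK SUMMARY" line layout shown above;
parse_leak_summary and SAMPLE are illustrative names, not part of any
tool.

```python
import re

# One summary line looks like:
#   ==169747==      possibly lost: 700,552 bytes in 5,677 blocks
_LEAK_RE = re.compile(
    r"^==\d+==\s+"
    r"(definitely lost|indirectly lost|possibly lost|still reachable|suppressed):"
    r"\s+([\d,]+) bytes in ([\d,]+) blocks",
    re.MULTILINE,
)

# Summary copied from the run above.
SAMPLE = """\
==169747== LEAK SUMMARY:
==169747==    definitely lost: 48 bytes in 2 blocks
==169747==    indirectly lost: 136 bytes in 6 blocks
==169747==      possibly lost: 700,552 bytes in 5,677 blocks
==169747==    still reachable: 5,450 bytes in 48 blocks
==169747==         suppressed: 0 bytes in 0 blocks
"""


def _to_int(number):
    """Convert a number with thousands separators, like '700,552', to int."""
    return int(number.replace(",", ""))


def parse_leak_summary(text):
    """Map each leak kind to a (bytes, blocks) tuple."""
    return {
        kind: (_to_int(nbytes), _to_int(nblocks))
        for kind, nbytes, nblocks in _LEAK_RE.findall(text)
    }
```

Calling parse_leak_summary(SAMPLE)["possibly lost"] returns the
(bytes, blocks) pair behind the "around 700 KB" figure.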

Even if you ignore the "run multiple interpreters in parallel" and PEP
554 use cases, enhancing code to better work with subinterpreters also
makes Python a better library to embed in applications and so is
useful.

Victor

On Wed, Jun 10, 2020 at 04:46, Inada Naoki <songofacandy@gmail.com> wrote:
...
On Tue, Jun 9, 2020 at 10:28 PM Petr Viktorin <encukou@gmail.com> wrote:
...
Relatively recently, there is an effort to expose interpreter creation &
finalization from Python code, and also to allow communication between
them (starting with something rudimentary, sharing buffers). There is
also a push to explore making the GIL per-interpreter, which ties in to
moving away from process-global state. Both are interesting ideas, but
(like banishing global state) not the whole motivation for
changes/additions.
Some changes for a per-interpreter GIL don't help subinterpreters so much.
For example, isolating the memory allocator, including free lists and
constants, between subinterpreters makes each subinterpreter fatter.
I assume Mark is talking about such changes.
Now Victor is proposing to move the dict free list into per-interpreter
state, and the code looks good to me.  This is a change for a
per-interpreter GIL, but not for subinterpreters.
https://github.com/python/cpython/pull/20645
Should we commit this change to the master branch?
Or should we create another branch for such changes?
Regards,
--
Inada Naoki  <songofacandy@gmail.com>
-- 
Night gathers, and now my watch begins. It shall not end until my death.

Victor Stinner