Python no longer leaks memory at exit
Hi,

tl;dr Python no longer leaks memory at exit on the "python -c pass" command ;-)

== Bug report ==

In 2007, the bpo-1635741 issue was reported on SourceForge: "Interpreter seems to leak references after finalization".

This bug is 15 years old. It saw the bug tracker migration from SourceForge to Roundup (bugs.python.org) in 2007, the code migration from Subversion to Mercurial in 2009, and the second code migration from Mercurial to Git in 2016!

Link to the main issue: https://bugs.python.org/issue1635741

Some people reported similar issues seen by WinCRT debug (bpo-6741), MSVC Debug C Runtime (bpo-32026), GCC AddressSanitizer (bpo-26888), or LeakSanitizer and Valgrind (bpo-21387).

In general, "leaking memory" at Python exit doesn't matter, since the operating system releases all memory when a process completes. It matters when Python is embedded in an application. For example, Python can be used by a plugin which is unloaded and reloaded multiple times, so each cycle leaks again in the same process. It also matters for sub-interpreters.

== Tedious task ==

In the last 3 years, many people helped fix this old issue by converting static types to heap types, adding a module state to C extension modules, converting extensions to the multi-phase initialization API (PEP 489), fixing many memory leaks, fixing bugs, etc.

When it became possible to cleanly unload an extension module (thanks to multi-phase init), tests on sub-interpreters (which load and unload extension modules) showed many *old* reference leaks: all of them have been fixed! When a test is run in sub-interpreters, Python is able to detect leaks, whereas currently it doesn't check for memory leaks at Python exit.

== Python 3.10 regressions ==

During the Python 3.10 development, we identified and fixed 3 major regressions caused by this work:

* Converting static types to heap types makes them mutable: the Py_TPFLAGS_IMMUTABLETYPE flag was added; 68 types use it in Python 3.11 (bpo-43908).
* It became possible to create uninitialized objects by instantiating types which should not be instantiated directly: the Py_TPFLAGS_DISALLOW_INSTANTIATION flag was added; 41 types use it in Python 3.11 (bpo-43916). For example, you cannot create an instance of type(sys.flags).
* Heap types must implement the full GC protocol: Py_TPFLAGS_HAVE_GC flag, traverse and clear functions. Otherwise, the GC is unable to break reference cycles, and a type contains (multiple) strong references to itself (in the MRO tuple and in methods). All heap types have been fixed to fully implement the GC protocol (bpo-40217).

== PEPs ==

The work relies on multiple PEPs:

* PEP 489: Multi-phase extension module initialization
* PEP 573: Module State Access from C Extension Methods
* PEP 630: Isolating Extension Modules

== People who helped fix the issue ==

Incomplete list of people who helped to fix this issue:

* Christian Heimes (modules: symtable, _hashlib, _random, grp, pwd, _queue, spwd, _struct, gc, _posixshmem, _posixsubprocess, select, resource)
* Dong-hee Na (modules: _statistics, itertools, _heapq, _collections, _uuid, math, _stat, syslog, errno, fcntl, mmap, _dbm, _gdbm, _lzma, faulthandler, _bisect)
* Eric Snow (PEP 573)
* Erlend Egeberg Aasland (modules: _sqlite3, _sre, _multibytecodec)
* Hai Shi (modules: _json, _codecs, _crypt, _contextvars, _abc, _bz2, _locale, audioop, _ctypes_test, _weakref)
* Marcel Plch (PEP 573)
* Martin von Löwis (PEP 3121)
* Mohamed Koubaa (modules: sha256, multiprocessing, _winapi, _blake2, _sha3, _signal, _sha1, _sha512, _md5, zlib, _opcode, _overlapped, _curses_panel, termios, _sha256, scproxy, cmath, _lsprof, unicodedata)
* Nick Coghlan (PEP 489, PEP 573)
* Paulo Henrique Silva (modules: time, operator, _functools)
* Petr Viktorin (PEP 489, PEP 573, PEP 630)
* Stefan Behnel (PEP 489)
* Victor Stinner (modules: _string, marshal, _imp, _warnings, _thread)

The work was scattered across many sub-issues, so it was hard for me to track everyone who contributed, sorry about that! I only searched for "New changeset" in bpo-1635741.

For me, it was a very pleasant collaborative work :-) Contributors wrote pull requests; I reviewed and merged them. Slowly, contributors started to review each other's work and shared some recipes for these tasks.

== Close the very old bpo-1635741 issue ==

Today, in the main development branch, Python no longer leaks memory at exit for the simplest command: "python3 -c pass"!

Using a Python debug build, you can check the "python -I -X showrefcount -c pass" command (if you get a negative reference count, see bpo-46449), or you can use a memory debugger like Valgrind.

While the work is not 100% done, it's a great milestone (at least for me ;-)! 15 years after bpo-1635741 was reported, I can finally close it! Sadly, this bug will not see the bug tracker migration from Roundup (bugs.python.org) to GitHub ;-)

There are still some static types which should be converted to heap types and some extensions which should be ported to the multi-phase initialization API. But this work can be done in existing or new specific issues.

Again, a *big* thanks to every single person who helped directly and indirectly on fixing this issue!

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
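
The effect of the two type flags described under "Python 3.10 regressions" can be observed from pure Python. The following is only a minimal sketch: the helper names `try_instantiate` and `try_set_attribute` are hypothetical, and it assumes that `array.array` is one of the types carrying Py_TPFLAGS_IMMUTABLETYPE, which the message does not say. `type(sys.flags)` is the example given in the message.

```python
import array
import sys


def try_instantiate(tp):
    """Call tp() and return the TypeError message, or None if the call worked."""
    try:
        tp()
    except TypeError as exc:
        return str(exc)
    return None


def try_set_attribute(tp, name="spam"):
    """Try to set an attribute on the type object itself.

    Returns the TypeError message if the type rejects the assignment, or
    None if it was accepted (the attribute is removed again in that case).
    """
    try:
        setattr(tp, name, 1)
    except TypeError as exc:
        return str(exc)
    delattr(tp, name)
    return None


class Plain:
    """Ordinary Python class: instantiable and mutable, for comparison."""


# type(sys.flags) is the example from the message.
flags_error = try_instantiate(type(sys.flags))
# Assumption: array.array is among the types using Py_TPFLAGS_IMMUTABLETYPE.
array_error = try_set_attribute(array.array)

print("type(sys.flags)():", flags_error)
print("array.array.spam = 1:", array_error)
print("Plain():", try_instantiate(Plain))
print("Plain.spam = 1:", try_set_attribute(Plain))
```

A TypeError here only shows the observable behaviour. It does not tell whether the type is protected by one of the new flags or is simply still a static type, and the exact wording of the message may differ between Python versions.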
Hey Victor,

First of all, I would like to thank Victor for all his efforts to resolve this issue. And I want to thank the other contributors to this issue as well!

I am very happy to hear that we can close bpo-1635741. This has been a long mission requiring patience and I'm proud to finally be able to put an end to it.

Let's move forward to the next step!

Warm regards,
Dong-hee

On Fri, Jan 28, 2022 at 12:37 AM, Victor Stinner <vstinner@python.org> wrote:
Hi,
tl;dr Python no longer leaks memory at exit on the "python -c pass" command ;-)
On Thu, Jan 27, 2022 at 8:40 AM Victor Stinner <vstinner@python.org> wrote:
tl;dr Python no longer leaks memory at exit on the "python -c pass" command ;-)
Thanks to all for the effort on this! Would it be worth adding a test to make sure we don't start leaking memory again?

-eric
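
One way such a regression test could look is sketched below. It is not an existing CPython test. It runs the "python -I -X showrefcount -c pass" command from the announcement in a child process and parses the totals printed at exit. The names `parse_showrefcount`, `refs_and_blocks_at_exit` and `ExitLeakTest` are hypothetical. The sketch assumes that a debug build prints a line of the form "[N refs, M blocks]" on stderr and that "no leak" means both numbers are zero. On a release build the option prints nothing, so the test is skipped.

```python
import re
import subprocess
import sys
import unittest

# Assumed format of the line printed by "-X showrefcount" on a debug build.
_TOTALS_RE = re.compile(r"\[(-?\d+) refs, (-?\d+) blocks\]")


def parse_showrefcount(stderr_text):
    """Return (refs, blocks) from the last matching line, or None if absent."""
    matches = _TOTALS_RE.findall(stderr_text)
    if not matches:
        return None
    refs, blocks = matches[-1]
    return int(refs), int(blocks)


def refs_and_blocks_at_exit(code="pass"):
    """Run `python -I -X showrefcount -c code` and parse the totals at exit."""
    proc = subprocess.run(
        [sys.executable, "-I", "-X", "showrefcount", "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_showrefcount(proc.stderr)


class ExitLeakTest(unittest.TestCase):
    def test_no_leak_on_pass(self):
        totals = refs_and_blocks_at_exit("pass")
        if totals is None:
            self.skipTest("requires a Python debug build")
        # Assumption: a leak-free exit reports zero refs and zero blocks.
        self.assertEqual(totals, (0, 0))
```

The parsing is kept separate from the subprocess call so that it can be checked without a debug build. A negative reference count would make the test fail rather than pass silently, which matches the bpo-46449 caveat in the announcement.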
participants (4)
- Dong-hee Na
- Eric Snow
- Simon Cross
- Victor Stinner