[New-bugs-announce] [issue36867] Make semaphore_tracker track other system resources

Pierre Glaser report at bugs.python.org
Thu May 9 13:36:01 EDT 2019


New submission from Pierre Glaser <pierreglaser at msn.com>:

Hi all,

Olivier Grisel, Thomas Moreau and I are currently working on extending
the scope of the semaphore_tracker in Python.

multiprocessing.semaphore_tracker is a little-known module that launches a
server process used to track the life cycle of semaphores created in a Python
session, and potentially clean up those semaphores after all Python processes
of the session have terminated. Normally, Python processes clean up the
semaphores they create. This is however not guaranteed if the processes get
violently interrupted (using for example the bash command "killall python").

A note on why the semaphore_tracker was introduced: Cleaning up semaphores
after termination is important because the system only supports a limited
number of named semaphores, and leaked ones are not removed automatically
until the next reboot.

Now, Python 3.8 introduces the creation of shared memory segments. Shared
memory is another sensitive global system resource. Currently, unexpected
termination of processes that created memory segments will result in leaking
those memory segments. This can be problematic for large compute clusters that
have many users and are rarely rebooted.

For this reason, we expanded the semaphore_tracker to also track shared memory
segments, and renamed it resource_tracker. Shared memory segments get
automatically tracked by the resource tracker when they are created. This is a
first, self-contained fix. (1)
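
For readers unfamiliar with the new shared memory support, here is a minimal
sketch of the life cycle of a segment, using the
multiprocessing.shared_memory.SharedMemory class from Python 3.8. The helper
name "roundtrip" is only illustrative. The point is that a process which exits
normally calls close() and unlink() itself; the tracker is meant for the case
where it never gets the chance to.

```python
from multiprocessing import shared_memory


def roundtrip(payload):
    """Write payload into a new shared memory segment and read it back."""
    # Creating the segment is the step at which it becomes a system-wide
    # resource that can leak if the process is killed before unlink().
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        shm.buf[:len(payload)] = payload
        # Attach a second handle by name, as another process would do.
        other = shared_memory.SharedMemory(name=shm.name)
        try:
            return bytes(other.buf[:len(payload)])
        finally:
            other.close()
    finally:
        shm.close()   # release this handle
        shm.unlink()  # ask the system to destroy the segment


if __name__ == "__main__":
    print(roundtrip(b"hello"))
```

If the process were killed between the creation and the unlink() call, the
segment would stay allocated, which is the situation described above.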

Additionally, supporting shared memory tracking led to a more generic design
for the resource_tracker. The resource_tracker can now be easily extended
to track arbitrary resource types.
A public API could potentially be exposed for users willing to track other
types. For example, one may want to add tracking for temporary folders created
during Python sessions. Another use case is joblib, which
is a widely used parallel-computing package, and also the backend of
scikit-learn. Joblib relies heavily on memmapping. A public API could extend
the resource_tracker to track memmap-ed objects with very little code.
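
To make the idea of a generic, extensible tracker more concrete, here is a toy
sketch of one possible shape for it: a mapping from resource type to cleanup
function, plus register/unregister calls. Everything in it (the
ToyResourceTracker class, its method names, the "folder" type) is hypothetical
and invented for illustration. It is not the multiprocessing API, and unlike
the real tracker it runs in-process rather than in a separate server process.

```python
import os
import shutil
import tempfile


class ToyResourceTracker:
    """Hypothetical in-process illustration of a per-type cleanup registry."""

    def __init__(self):
        self._cleanup_funcs = {}  # resource type -> cleanup function
        self._tracked = {}        # resource type -> set of resource names

    def add_resource_type(self, rtype, cleanup_func):
        """Declare a new resource type and how to destroy one instance."""
        self._cleanup_funcs[rtype] = cleanup_func
        self._tracked[rtype] = set()

    def register(self, name, rtype):
        self._tracked[rtype].add(name)

    def unregister(self, name, rtype):
        self._tracked[rtype].discard(name)

    def cleanup_leaked(self):
        """Destroy everything still registered; return what was leaked."""
        leaked = []
        for rtype, names in self._tracked.items():
            for name in sorted(names):
                self._cleanup_funcs[rtype](name)
                leaked.append((rtype, name))
            names.clear()
        return leaked


def remove_folder(path):
    shutil.rmtree(path, ignore_errors=True)


if __name__ == "__main__":
    tracker = ToyResourceTracker()
    tracker.add_resource_type("folder", remove_folder)

    kept = tempfile.mkdtemp()
    leaked = tempfile.mkdtemp()
    tracker.register(kept, "folder")
    tracker.register(leaked, "folder")

    # Well-behaved path: the owner cleans up and unregisters.
    remove_folder(kept)
    tracker.unregister(kept, "folder")

    # Only the folder that was never unregistered is reported and removed.
    print(tracker.cleanup_leaked())
    print(os.path.exists(leaked))
```

With such a design, supporting a new resource type (memmap-ed files, for
instance) would only require supplying one more cleanup function.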

Therefore, this issue serves two purposes:
- referencing the semaphore_tracker enhancement mentioned in (1)
- discussing a potentially public resource_tracker API.

----------
components: Library (Lib)
messages: 341987
nosy: pablogsal, pierreglaser, pitrou
priority: normal
severity: normal
status: open
title: Make semaphore_tracker track other system resources
type: resource usage
versions: Python 3.8

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue36867>
_______________________________________

