[Python-Dev] Usage of the multiprocessing API and object lifetime

Antoine Pitrou solipsis at pitrou.net
Tue Dec 11 10:46:35 EST 2018


On Tue, 11 Dec 2018 16:33:54 +0100
Victor Stinner <vstinner at redhat.com> wrote:
> On Tue, 11 Dec 2018 at 16:14, Antoine Pitrou <solipsis at pitrou.net> wrote:
> > What you are proposing here starts to smell like an anti-pattern to
> > me.  Python _is_ a garbage-collected language, so by definition, there
> > _are_ going to be resources that are automatically collected when an
> > object disappears.  If I'm allocating a 2GB bytes object, then PyPy may
> > delay the deallocation much longer than CPython.  Do you propose we add
> > a release() method to bytes objects to avoid this issue (and emit a
> > warning for people who don't call release() on bytes objects)?  
> 
> We are not talking about simple strings, but about processes and threads.

Right, but do those have an impact on the program's correctness, or
simply on its performance (or memory consumption)?

> "user-visible consequences" are that resources are kept alive longer
> than I would expect. When I use a context manager, I expect that
> Python will magically releases everything for me.

I think there's a balancing act here between "with pool" releasing
everything and not taking too much time to execute the __exit__ method.
Currently, threads and processes may finish quietly between __exit__
and __del__, without adding significant latency to your program's
execution.
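
To make the trade-off concrete, here is a minimal sketch (the square()
worker and the pool size of 4 are made up for illustration).
Pool.__exit__ calls terminate(), so the bare "with" form does not wait
for the workers, while an explicit close()/join() does:

    from multiprocessing import Pool

    def square(x):
        return x * x

    if __name__ == "__main__":
        # Pool.__exit__ calls terminate(): the workers are told to
        # stop immediately, and the "with" block does not wait for
        # them to actually exit.
        with Pool(4) as pool:
            print(pool.map(square, range(10)))

        # Explicit variant: deterministic shutdown, at the cost of
        # blocking until every worker has exited.
        pool = Pool(4)
        try:
            print(pool.map(square, range(10)))
        finally:
            pool.close()  # no more tasks will be submitted
            pool.join()   # wait for the worker processes to exit

The first form keeps __exit__ cheap; the second trades some shutdown
latency for deterministic reaping, which is exactly the balance above.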

> I prefer to explicitly manage resources like processes and threads,
> since they can exit with an error: killed by a signal, waitpid()
> failure (exit status already read by a different function), etc.

But multiprocessing.Pool manages them implicitly _by design_.  People
who want to manage processes explicitly can use the Process class
directly ;-)
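
For completeness, a minimal sketch of that explicit style (the
worker() function is made up), using only the documented Process API:

    from multiprocessing import Process

    def worker():
        print("doing some work")

    if __name__ == "__main__":
        p = Process(target=worker)
        p.start()
        p.join()  # explicitly wait for the child process to exit
        # exitcode is -N if the child was terminated by signal N
        if p.exitcode != 0:
            raise RuntimeError("worker failed with exit code %r"
                               % p.exitcode)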

Regards

Antoine.
