[scikit-learn] scikit-learn 1 - pytest - multiprocessing Pool - hangs?

Norbert Preining norbert at preining.info
Wed Dec 8 23:57:33 EST 2021


Dear all,

I am trying to track down a strange behaviour in one of our (Fujitsu)
libraries that we are planning to open source. In preparation for that,
I am trying to get it working with scikit-learn >= 1.

However, some of our tests fail when running in parallel mode, and they
fail only when run under pytest, NOT when run directly with python.

The library code contains

	def fit(self, X, y=None):
	    ...
	    p = multiprocessing.Pool()
	    ret = _reduce(
	        p.map(....))

With scikit-learn 1 (1.0.1), this code hangs forever. I also tried
creating the pool in the __init__ function instead of in fit, and
storing it on self, but that didn't help either.

When interrupted, pytest gives:

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! KeyboardInterrupt !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/threading.py:312: KeyboardInterrupt
(to show a full traceback on KeyboardInterrupt use --full-trace)
================================================ 1 passed, 2 warnings in 273.84s (0:04:33) =================================================
Exception ignored in: <function Pool.__del__ at 0x7ff72f31b9d0>
Traceback (most recent call last):
  File "/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/multiprocessing/pool.py", line 268, in __del__
    self._change_notifier.put(None)
  File "/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/multiprocessing/queues.py", line 378, in put
    self._writer.send_bytes(obj)
  File "/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/multiprocessing/connection.py", line 205, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
    self._send(header + buf)
  File "/home/norbert/.pyenv/versions/3.9.6/lib/python3.9/multiprocessing/connection.py", line 373, in _send
    n = write(self._handle, buf)


When run with python testfile.py, however, everything works.
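
The traceback above only shows the main thread waiting in threading.py.
One way to see every thread's stack while the test hangs is the standard
faulthandler module. The sketch below is an assumption about how one
might wire it in; run_with_watchdog and its parameters are hypothetical
names, not part of pytest or the library.

```python
import faulthandler
import sys


def run_with_watchdog(func, timeout=60.0, file=None):
    """Call func(); if it takes longer than `timeout` seconds, dump all thread stacks."""
    out = file if file is not None else sys.__stderr__
    # Arms a watchdog that writes every thread's traceback to `out`
    # after `timeout` seconds, without killing the process (exit=False).
    faulthandler.dump_traceback_later(timeout, exit=False, file=out)
    try:
        return func()
    finally:
        # Disarm the watchdog once func() has returned or raised.
        faulthandler.cancel_dump_traceback_later()
```

Wrapping the fit call of a hanging test in run_with_watchdog would print
the stacks of the parent process only; the worker processes are not
covered. Recent pytest versions also have a faulthandler_timeout ini
option that serves a similar purpose.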


I have tested the following combinations:
* scikit-learn 0.23.*, python 3.8 and python 3.9 => works
* scikit-learn 0.24.*, python 3.8 and python 3.9 => works
* scikit-learn 1.0.1,  python 3.8 and python 3.9 => fails
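
Since the same test behaves differently under pytest and under plain
python, it may help to record a few process-level settings in both runs
and compare them. The helper below is a hypothetical sketch that uses
only the standard library; which settings actually matter here is an
open question.

```python
import multiprocessing
import sys
from importlib import metadata


def describe_environment():
    """Collect settings that may differ between a pytest run and a plain python run."""
    try:
        sklearn_version = metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        sklearn_version = None
    return {
        "python": sys.version.split()[0],
        "scikit-learn": sklearn_version,
        # None means no start method has been fixed yet in this process.
        "start_method": multiprocessing.get_start_method(allow_none=True),
        # Whether scikit-learn has already been imported at this point.
        "sklearn_imported": "sklearn" in sys.modules,
    }
```

Printing describe_environment() at the top of the test and again from
python testfile.py gives two dictionaries to compare side by side.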

I don't really understand where scikit-learn comes into play here,
so I wanted to ask whether someone here has an idea.

Thanks for any suggestions.


Norbert

--
PREINING Norbert                              https://www.preining.info
Fujitsu Research  +  IFMGA Guide  +  TU Wien  +  TeX Live  + Debian Dev
GPG: 0x860CDC13   fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13

