On Wed, Nov 17, 2021 at 11:36 AM Jeremie du Boisberranger <jeremie.du-boisberranger@inria.fr> wrote:
Hi,
On 17/11/2021 11:26, Matthew Brett wrote:
> Hi,
>
> On Wed, Nov 17, 2021 at 10:00 AM Ralf Gommers <ralf.gommers@gmail.com> wrote:
>>
>>
>> On Wed, Nov 17, 2021 at 10:13 AM Matthew Brett <matthew.brett@gmail.com> wrote:
>>> Hi,
>>>
>>> On Wed, Nov 17, 2021 at 5:57 AM Ralf Gommers <ralf.gommers@gmail.com> wrote:
>>> [...]
>>>> - Use `threadpoolctl` (see the README at https://github.com/joblib/threadpoolctl for how)
>>>> - Set an environment variable to control the behavior, e.g. `OPENBLAS_NUM_THREADS`
>>>> - Rebuild the `libopenblas` we bundle in the wheel to have a max number of threads of 1, 2, or 4.
>>>>
>>>> SciPy doesn't have a `threadpoolctl` runtime dependency, and it doesn't seem desirable to add one just for this issue. Note though that gh-14441 aims to add it as an _optional_ dependency to improve test suite parallelism, and longer term we perhaps do want that dependency. Also, scikit-learn has a hard dependency on it, so many users will already have it installed.
>>> Is threadpool a heavy dependency? Am I right in thinking this is the
>>> most satisfying of the solutions, if available?
>>
>> Yes, I believe so. Of course it's hard to be 100% sure, because the root cause isn't well understood.
>>
>>> If it's not heavy,
>>> and it is a good solution - that seems like the right way to go.
>>
>> It's not heavy, it's a small-ish pure Python library with no dependencies. It can cause problems of course because it's trying to do something nontrivial - but that's true of code we write as well. For the minimal purpose we need it (setting default number of threads for SciPy, and probably also for NumPy if it uses OpenBLAS) it should be fine.
>>
>> I think the main concern is simply having any new external dependency - so far NumPy has zero (on PyPI at least) and SciPy has one (namely NumPy). Any new runtime dependency inevitably is going to result in problems at some point. Which is why we avoid them, so it's not a minor decision. I'd also be hesitant to do that in 1.7.3 rather than in 1.8.0
> I understand the general reluctance - but, given the state of
> packaging tools - a single pure-Python package doesn't sound like a
> significant worry....
threadpoolctl is designed to solve precisely this kind of issue:
>>> controller =
threadpoolctl.ThreadpoolController().select(internal_api="openblas")
>>> with controller.limit(limits=1):
... # single threaded blas call here
I understand that adding a hard dependency on a package that has
currently only numpy as dependency is not desirable. Would vendoring it
be an alternative ?
Thanks for the feedback Jérémie. If we vendor it, and we get two copies of threadpoolctl floating around that are both being used at the same time, will that work fine?
It's true that it might have some unexpected behavior, in a nested setting for instance. Let's say scikit-learn sets a threadpoolctl context and call some scipy function inside that also uses threadpoolctl. It's not clear to me yet if everything will work smoothly.
Moreover, we could have set the limits to 1 in sklearn to avoid oversubscription but you decided to set the limit to 4 in scipy, then we would end up with oversubscription in sklearn (this case has nothing to do with vendoring or not however).
Jérémie du Boisberranger