On Tue, 2023-08-29 at 08:01 +0000, Nicolas Holzschuch wrote:
> Hello,
>
> This is my first post to this group; I'd like to start by expressing
> my appreciation for the amazing work in developing and maintaining
> Numpy.
>
> I have a question. Numpy has quite a lot of static local variables
> (variables defined as static inside a function), like this one
> (core/src/multiarraymodule.c, line 4483):
>     if (raise_exceptions) {
>         static PyObject *too_hard_cls = NULL;
>         /* ... */
>     }
>
> I understand that these variables provide local caching and are
> important for efficiency. They do however cause some issues when
> dealing with multiple subinterpreters, where the static local
> variable might have been initialized by one of the subinterpreters,
> and is not reset when accessed by another subinterpreter.
> More globally, they cannot be reset when the Numpy module is
> released, and thus will likely cause an issue if it is reloaded after
> being released.
Right, but in the end these caches (or almost all of them) are there
for a reason, and just removing them does not seem acceptable to me.
However, there are better ways to solve this: you can move the cached
objects into module state. In the vast majority of cases that should
not be hard, since the patterns are known. In a few cases it may be
harder, but I believe CPython offers decent solutions now (I am not
sure what they look like).
For a long time I had hoped that the HPy drive would solve this, but
there is no reason to wait for it.
In any case, contributions to this effect are very much welcome. I have
been hoping for a long time that they would come, but I am not excited
about just removing the "static".
- Sebastian
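
For readers following the thread, the difference between a static local
cache and per-module state can be sketched in plain C, without the
CPython headers. This is only an illustration under stated assumptions:
fake_object, expensive_load(), module_state and module_clear() are
hypothetical names, not NumPy or CPython API. In a real extension the
struct would be the per-module memory that CPython allocates when
PyModuleDef.m_size is positive and that PyModule_GetState() returns.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJECTS 8

/* Hypothetical stand-in for a Python object owned by one interpreter. */
typedef struct {
    int owner_id;
} fake_object;

static fake_object objects[MAX_OBJECTS];
static int load_count = 0;  /* how many times the slow path really ran */

/* Hypothetical stand-in for an expensive lookup such as an import. */
static fake_object *expensive_load(int interp_id)
{
    assert(load_count < MAX_OBJECTS);
    fake_object *obj = &objects[load_count++];
    obj->owner_id = interp_id;
    return obj;
}

/* Pattern 1: static local cache. One slot for the whole process, so the
 * first caller's object is handed to every later caller, and nothing
 * can reset it. */
static fake_object *get_cls_static(int interp_id)
{
    static fake_object *cache = NULL;
    if (cache == NULL) {
        cache = expensive_load(interp_id);
    }
    return cache;
}

/* Pattern 2: the cache lives in a state struct owned by one module
 * instance, so each interpreter fills and clears its own slot. */
typedef struct {
    int interp_id;
    fake_object *too_hard_cls;
} module_state;

static fake_object *get_cls_state(module_state *st)
{
    if (st->too_hard_cls == NULL) {
        st->too_hard_cls = expensive_load(st->interp_id);
    }
    return st->too_hard_cls;
}

/* Called when the module is released; the next use reloads cleanly. */
static void module_clear(module_state *st)
{
    st->too_hard_cls = NULL;
}
```

With two module_state values standing in for two subinterpreters, each
one gets its own object and still pays for the lookup only once, whereas
get_cls_static() keeps returning whatever the first caller loaded.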
>
> I have seen the issue mentioned in at least one pull request:
> https://github.com/numpy/numpy/pull/15169 and in several issues. If I
> understand correctly, the issue is not considered important because
> subinterpreters are not yet prominent in CPython, and because static
> local variables provide an important service in caching data locally
> (instead of exposing these variables globally). So the benefits of
> keeping them outweigh the costs and risks, especially since removing
> them would be a huge change to the code base.
>
> I happen to maintain, compile and run a version of Python on iOS (
> https://github.com/holzschu/a-shell/ or
> https://apps.apple.com/us/app/a-shell/id1473805438), where I have to
> remove all these static local variables because of the specifics of
> the platform (in order to run Python multiple times, I have to
> release and reset all modules). Right now, I'm maintaining the
> changes to the code base in a separate branch (
> https://github.com/holzschu/numpy/) and not necessarily in a very
> clean way.
>
> With the recent renewed interest in subinterpreters, I was wondering
> if there was a way I could contribute these changes back to the main
> numpy branch. I would have to clean up the code, obviously, and
> probably get guidance on how to do it cleanly, but the first question
> is: would there be an interest, or is that something I should keep in
> my separate branch?
>
> From a technical point of view, about 80% of these static local
> variables are just before a call to npy_cache_import(), and the
> most efficient way to do it (in terms of lines of code) is just to
> remove the part where npy_cache_import uses the static local
> variable. You pay a price in performance, but gain in usability.
>
> Best regards,
> Nicolas Holzschuch
>
This was discussed at the last community meeting. We are open to the idea, but would like to see how it works out in practice. In particular, what the code looks like.
Chuck