On Tue, Apr 21, 2020 at 10:24 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Tue, 2020-04-21 at 16:21 +0200, Victor Stinner wrote:
I fail to follow your logic. When the asyncio PEP was approved, I don't recall that suddenly the whole Python community started to rewrite all projects to use coroutines everywhere. I tried hard to replace eventlet with asyncio in OpenStack and I failed because such migration was a very large project with dubious benefits (people impacted by eventlet issues were the minority).
Sure, but this is very different. You can still use NumPy in a project using asyncio. You are _not_ able to use NumPy in a project using subinterpreters.
True. Is that a short-term problem? I don't know. A long-term problem? Definitely. So it will have to be addressed at some point. The biggest concern here is what is the resulting burden on extension authors and what can we do to help mitigate that. The first step is to understand what that burden might entail.
Right now, I have to say that as soon as the first bug report asking for this is opened and tells me "But see PEP 554, you should support it!", I would be tempted to put on the NumPy Roadmap/Vision that no current core dev will put serious effort into subinterpreters. Someone is bound to be mad.
Yeah. And I don't want to put folks in the position that they get fussed at for something like this. This isn't a ploy to force projects like numpy to fix their subinterpreter support. My (honest) question is, how many folks using subinterpreters are going to want to use numpy (or module X) enough to get mad about it before the extension supports subinterpreters? What will user expectations be when it comes to subinterpreters? We will make the docs as clear as we can, but there are plenty of users out there who will not pay enough attention to know that most extension modules will not support subinterpreters at first. Is there anything we can do to mitigate this impact? How much would it help if the ImportError for incompatible modules gave a clear (though lengthier) explanation of the situation?
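To make the ImportError idea concrete, here is a minimal sketch of what a lengthier explanation could look like. This only illustrates the wording. The helper name `incompatible_module_error` and the message text are assumptions, not anything CPython does or that the PEP specifies.

```python
# Hypothetical sketch: the helper name and the message wording are
# assumptions made for illustration, not an existing CPython API.

def incompatible_module_error(module_name):
    """Build an ImportError that explains the subinterpreter situation."""
    message = (
        f"module {module_name!r} does not support being imported in a "
        "subinterpreter. Support for subinterpreters in extension modules "
        "should not be expected while the 'interpreters' module is "
        "provisional. Import the module in the main interpreter instead."
    )
    # ImportError accepts a `name` keyword naming the module that failed.
    return ImportError(message, name=module_name)


error = incompatible_module_error("some_extension")
print(error.name)
print(error)
```

A message along these lines would point users at the documented status of the feature rather than at the extension's bug tracker.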
Basically, if someone wants it in NumPy, I personally may expect them to be prepared to invest a year's worth of good dev time [1]. Maybe that is pessimistic, but your guess is as good as mine. At normal dev-pace it will be at least a few years of incremental changes before NumPy might be ready (how long did it take Python?).
The PEP links to NumPy bugs, and I am not sure that we ever fixed a single one. Even if we did, the remaining ones are much larger and deeper. As of now, the NumPy public API has to be changed to even start supporting subinterpreters, as far as I am aware [2]. This is because right now we sometimes need to grab the GIL (to raise errors) in functions that are not passed GIL state.
What do you expect to have to change? It might not be as bad as you think...or I suppose it could be. :) Keep in mind that subinterpreter support means making sure all of the module's global state is per-interpreter. I'm hearing about things like passing around GIL state and using the limited C-API. None of that should be a factor.
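As a rough illustration of "global state is per-interpreter", here is a conceptual model in plain Python. It is not the C-API and not how an extension module would actually be written. `Interpreter`, `shared_cache`, and `get_cache` are hypothetical names used only to show the difference between process-wide state and state owned by one interpreter.

```python
# Conceptual model only: real extension modules keep this state in C.
# `Interpreter`, `shared_cache` and `get_cache` are hypothetical names.

class Interpreter:
    """Stand-in for one (sub)interpreter that owns its own module state."""

    def __init__(self, name):
        self.name = name
        self.module_state = {}  # storage private to this interpreter


# Process-wide global: every interpreter would see the same object.
shared_cache = {}


def get_cache(interp):
    """Return the cache that belongs to `interp` and to no one else."""
    return interp.module_state.setdefault("cache", {})


main = Interpreter("main")
sub = Interpreter("sub")

shared_cache["dtype"] = "set by main"     # leaks into every interpreter
get_cache(main)["dtype"] = "set by main"  # visible to main only

print("dtype" in shared_cache)    # True
print("dtype" in get_cache(sub))  # False
```

The work for an extension is moving everything that looks like `shared_cache` into something that behaves like `get_cache(interp)`.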
This all is not to say that this PEP itself doesn't seem harmless. But the _expectation_ that subinterpreters should be first class citizens will be a real and severe transition burden. And if it does not, the current text of the PEP gives me, as someone naive about subinterpreters, very few reasons why I should put in that effort or reasons to make me believe that it actually is not as bad a transition as it seems.
Yeah, the PEP is very light on useful information for extension module maintainers. What information do you think would be most helpful?
Right now, I would simply refuse to spend time on it. But as Nathaniel said, it may be worse if I did not refuse and in the end only a handful of users get anything out of my work: The time is much better spent elsewhere. And you, i.e. CPython, will spend your "please fix your C-extension" chips on subinterpreters. Maybe that is the only thing on the agenda, but if it is not, it could push other things away.
Good point.
Reading the PEP, it is fuzzy on the promises (the most concrete I remember is that it may be good for security relevant reasons), which is fine, because the goal is "experimentation" more than use?
The PEP is definitely lacking clear details on how folks might use subinterpreters (via the proposed module). There are a variety of reasons. I originally wrote the PEP mostly as "let's expose existing functionality more broadly", with the goal of getting it into folks' hands sooner rather than later. My focus was mostly on the API. I didn't see a strong need to convince anyone that the feature itself was worth it (since it already existed). In many ways the PEP is a side effect of my efforts to achieve a good multi-core Python story (via a per-interpreter GIL). All the relevant parties in that effort saw PEP 554 as worth it for that, so there wasn't much pressure to elaborate. At the same time, in the PEP I tried to avoid the GIL connection because I personally think subinterpreters offer a meaningful value already, even while they share the GIL, and that the PEP should stand on its own merits. I just didn't develop that aspect very far. On top of that, ultimately a PEP is a vehicle to aid a BDFL-delegate in making a decision, so I usually write PEPs with a specific audience. Put all that together, along with the limited time I have, and you get a lackluster explanation of the benefits from subinterpreters. At this point I'm torn on spending more time on fleshing that out. I want folks to get inspired, but at the same time I have to ration my open-source time carefully.
So if it's more about "experimentation", then I have to ask whether:
1. The PEP can state that more obviously: it wants to be provisionally/experimentally accepted? So maybe it should even say that extension modules are not (really) encouraged to transition unless they feel a significant portion of their users will gain.
That sounds fair. Then extension authors can point concerned users to the docs. The docs would say something like "Support for subinterpreters in extension modules should not be expected while this module is provisional." That isn't the only thing we should do to set user expectations properly, but it would help. (FWIW, for extension maintainers we would likely also have a link on the "interpreters" module docs pointing to the how-to-support-subinterpreters-in-an-extension-module page.)
2. The point about developing it outside of the Python standard lib should be considered more seriously. I do not know if that can be done, but C-API additions/changes/tweaks seem a bit orthogonal to the python exposure? So maybe it actually is possible?
It is possible. The low-level implementation is an extension module that uses the public C-API exclusively. Using the internal C-API would be a little dishonest. That said, the module is very closely tied to the CPython runtime, so it makes sense to keep it in the CPython repo. Furthermore, I don't see much value in keeping it out of the stdlib. Finally, a few of us have plans for using subinterpreters in CPython's test suite. I suppose we could keep the module's code and tests in the CPython repo, while releasing it separately, but I don't see the point.
As far as I can tell, nobody can or _should_ expect subinterpreters to actually run most general python code for many years.
Subinterpreters run all Python code right now. I'm guessing by "general python code" you are talking about the code folks are writing plus their dependencies. In that case, it's only with extension modules that we run into a problem, and we still don't know with how many of those it's a problem where it will take a lot of work. However, I *am* convinced that there is a non-trivial amount of work there and that it impacts large extension modules more than others. The question is, what can we do to mitigate the amount of work there?
Yes, it's a chicken-and-egg problem: unless users start to use subinterpreters successfully, C-extensions should probably not even worry about transitioning. This PEP wants to break the chicken-and-egg problem to have a start, but as of now, as far as I can tell, it *must not* promise that it will ever work out.
I see where you're coming from but think it's highly unlikely at this point in CPython's life that subinterpreters (as a public feature) will ever go away. I'm definitely biased :) but it would be hard to justify removing a feature that has been publicly available for most of Python's existence. We know of a small number of public projects that already use subinterpreters through the C-API and expect there is a not insignificant number of private projects that rely on the feature. (That doesn't even consider the projects that tried to use subinterpreters but couldn't move past the limitations (which are mostly due to lack of use).)
So, I cannot judge the sentiment on subinterpreters. But it may be good to make it *painfully* clear what you expect from a project like NumPy in the next few years. Alternatively, make it painfully clear that you possibly even discourage us from spending time on it now, if it's not straightforward. Those using this module are on their own for many years, probably even after success is proven.
Thanks for the feedback!
[1] As of now, the way I see it is that I could not even make NumPy (and probably many C extensions) work, because I doubt that the limited API has been exercised enough [2] and I am pretty sure it has holes. Also the PEP about passing module state around to store globals efficiently seems necessary, and is not in yet? (Again, trust: I have to trust you that e.g. what you do to make argument parsing not have overhead in argument clinic will be something that I can use for similar purposes within NumPy)
The relevant API is on its way. We'd be glad to pursue C-API improvements to facilitate efficient support for subinterpreters in extension modules. FWIW, often such improvements provide other benefits that are desirable on their own.
[2] I hope that we will do (many of) these changes for other reasons within a year or so,
That's good to know. What specific changes are you planning for? Are there documents or issue #s you could point us at? This sort of thing may be helpful to identify how we can assist from the CPython side.
but they go deep into code barely touched in a decade. Realistically, even after the straightforward changes (such as using the new PEPs for module initialization), these may take up an additional few months of dev time (sure, get someone who is very good or who does nothing else, and they can maybe do it much quicker). So yes, from the perspective of a complex C-extension, this is probably very comparable to the 2to3 change (it happened largely before my time though).
Also good to know. I'm hopeful that nothing will ever be as disruptive as the 2/3 transition. :) I can see what you mean on a project-by-project basis though.
[3] E.g. I think I want an ExtensionMetaClass, a bit similar to an ABC, but I would prefer to store the data in a true C-slot fashion. The limited API cannot do MetaClasses correctly as far as I could tell and IIRC is likely even a bit buggy. Are ExtensionMetaClasses crazy? Maybe, but PySide does it too (and as far as I can tell, they basically get away with it by a bit of hacking and relying on Python implementation details).
That's an interesting idea. How much would that help with subinterpreter support? Regardless, you should consider bringing this up separately on the capi-sig mailing list. Thanks again! -eric