On Fri, Apr 17, 2020 at 2:59 PM Nathaniel Smith <njs@pobox.com> wrote:
> I think some perspective might be useful here :-).
> The last time we merged a new concurrency model in the stdlib, it was asyncio.
> [snip]
> OTOH, AFAICT the new concurrency model in PEP 554 has never actually been used, and it isn't even clear whether it's useful at all.
Perhaps I didn't word things quite right. PEP 554 doesn't provide a new concurrency model so much as it provides functionality that could probably be used as the foundation for one. Ultimately the module proposed in the PEP does the following:

* exposes the existing subinterpreters functionality almost as-is
* provides a minimal way to pass information between subinterpreters (which you don't need in C but do in Python code)
* adds a few minor conveniences, like propagating exceptions and making it easier to share buffers safely

So the comparison with asyncio isn't all that fair.
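To make that concrete, here's a rough sketch of what using the proposed module looks like, going by the names in the current PEP draft (create(), Interpreter.run(), create_channel()); the details may still change, and both channel ends are used from the main interpreter here only to keep the sketch short:

    import interpreters  # the module proposed in PEP 554 (draft names)

    # Expose the existing functionality: create a subinterpreter and run code in it.
    interp = interpreters.create()
    interp.run("x = 'spam'")  # executes in the subinterpreter's own __main__

    # The minimal way to pass information between interpreters: a channel with
    # separate receive and send ends, limited to simple shareable objects.
    # (Normally one end would be handed off to another interpreter.)
    recv, send = interpreters.create_channel()
    send.send_nowait(b"ping")
    print(recv.recv())  # -> b'ping'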
> Designing useful concurrency models is *stupidly* hard. And on top of that, it requires major reworks of the interpreter internals +
Nearly all of the "rework" is worth doing for other good reasons. Furthermore, I'd call at most a few of the reworks "major".
> disrupts the existing C extension module ecosystem -- which is very different from asyncio, where folks who didn't use it could just ignore it.
The concern is users opening issues saying "your extension won't work in subinterpreters", right? (You brought up the possible burden on extension authors in discussions several years ago, for which I'm still grateful.) How is that different from any other feature, new or not? "Your library doesn't provide an awaitable API." "Your library doesn't support pickling." "Your library doesn't implement the buffer protocol." PEP 554 doesn't introduce some new kind of impact. So (unless I've misunderstood), by your reasoning we wouldn't add any new features for which library authors might have to change something.

Are you suggesting that the burden from PEP 554 will be larger than saying "then don't try to use our extension in subinterpreters"? Are you concerned about users reporting bugs that surface when an incompatible extension is used in a subinterpreter? That shouldn't be a problem if we raise ImportError whenever an extension that does not support PEP 489 is imported in a subinterpreter (see the sketch below).

FWIW, the impact on extension authors is the one thing about which I still have any meaningful uncertainty and worry. Various people have explained to me how it won't be a big problem, but I'm still nervous about it. I just don't think my worry is larger than the actual risk (and possible cost).
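Roughly, the guard I have in mind would surface like this (a sketch only; "some_legacy_ext" is a made-up single-phase-init extension, and the propagation uses the RunFailedError from the PEP draft):

    import interpreters  # as proposed in PEP 554 (draft names)

    interp = interpreters.create()
    try:
        # An extension that has not been ported to PEP 489 (multi-phase init)
        # would simply fail to import in a subinterpreter...
        interp.run("import some_legacy_ext")
    except interpreters.RunFailedError as exc:
        # ...and the uncaught ImportError is propagated back to the caller,
        # rather than the extension silently misbehaving.
        print("some_legacy_ext can't be used in subinterpreters:", exc)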
> So to me, it's kind of shocking that you'd even bring up the possibility of merging PEP 554 as-is, without even a provisional marker.
Unsurprisingly it isn't shocking to me. :) From my point of view it seems okay. However, I'll be the first to recognize how hard it can be to see things from a different perspective. Hence I started this thread. :)
> And if it's possible for it to live on PyPI, then why would we even consider putting it into the stdlib?
Given that we're exposing functionality of the CPython runtime I don't see the point in keeping this out of the stdlib. Furthermore, there are use cases to explore for subinterpreters in our test suite that we can address only if the "interpreters" module is part of the CPython repo. So why keep it hidden away and then publish the exact same thing on PyPI?
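As a rough illustration of the kind of test-suite use I mean (a hypothetical test, using only create() and run() from the PEP draft):

    import unittest
    import interpreters  # the stdlib module proposed in PEP 554

    class SubinterpreterIsolationTests(unittest.TestCase):
        def test_main_namespace_is_isolated(self):
            interp = interpreters.create()
            # A name defined in the subinterpreter's __main__ must not leak
            # into the main interpreter's namespace.
            interp.run("leaked = 'spam'")
            self.assertNotIn("leaked", globals())

    if __name__ == "__main__":
        unittest.main()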
> Personally, I'm still leaning towards thinking that the whole subinterpreter project is fundamentally flawed, and that on net we'd be better off removing support for them entirely.
By which I imagine you mean drop the subinterpreters API and not actually get rid of all the architecture related to PyInterpreterState (which is valuable from a code health perspective).
> But that's a more complex and nuanced question that I'm not 100% certain of, while the idea of merging it for 3.9 seems like a glaringly obvious bad idea.
Yeah, I remember your position from previous conversations (and still appreciate your feedback). :)
> I know you want folks to consider PEP 554 on its own merits, ignoring the GIL-splitting work, but let's be realistic: purely as a concurrency framework, there's at least a dozen more mature/featureful/compelling options in the stdlib and on PyPI, and as an isolation mechanism, subinterpreters have been around for >20 years and in that time they've found 3 users and no previous champions. Obviously the GIL stuff is the only reason PEP 554 might be worth accepting.
Saying it's "obviously" the "only" reason is a bit much. :) PEP 554 exposes existing functionality that hasn't been all that popular (until recently for some reason <wink>) mostly because it is old, was never publicized (until recently), and involved using the C-API. As soon as folks learn about it they want it, for various reasons including (relative) isolation and reduced resource usage in large-scale deployment scenarios. It becomes even more attractive if you say subinterpreters allow you to work around the GIL in a single process, but that isn't the only reason.
> Or if PEP 554 is really a good idea on its own merits, purely as a new concurrency API, then why not build that concurrency API on top of multiprocessing and put it on PyPI and let real users try it out?
As I said, the aim of PEP 554 isn't to provide a full concurrency model, though it could facilitate something like CSP. FWIW, there are CSP libraries on PyPI already, but they are limited due to their reliance on threads or multiprocessing.
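To sketch what I mean by "facilitate": a CSP-ish setup where a worker interpreter shares no state with the main interpreter and communicates only over channels could look roughly like this. Note that handing the channel ends to the subinterpreter via a channels argument is my reading of the current draft and may not match what finally lands:

    import interpreters  # as proposed in PEP 554 (draft names)

    tasks_recv, tasks_send = interpreters.create_channel()      # main -> worker
    results_recv, results_send = interpreters.create_channel()  # worker -> main

    # Queue the work first, since run() below blocks until the worker's code finishes.
    tasks_send.send_nowait(b"spam")

    worker = interpreters.create()
    # The worker shares no objects with us; it sees only the two channel ends.
    worker.run(
        "results.send_nowait(tasks.recv().upper())",
        channels={"tasks": tasks_recv, "results": results_send},
    )

    print(results_recv.recv())  # -> b'SPAM'

In a real setup the worker would run in its own thread so both sides execute concurrently; the point is just that the interpreters interact only through the channels.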
> [snip]
> Normally, when people reference this story they focus on the bikeshed, hence the term "bikeshedding". But for PEP 554, you're building a nuclear power plant :-).
:)
> The whole conglomeration of a new concurrency API,
As noted, rather than a new concurrency API, it provides the basic functionality that could be used to build such an API. The distinction might be subtle, but it is significant and rests on the expectations we set for users.
> new subinterpreter support in the interpreter,
There is nothing new here. Subinterpreters have been around for over 20 years and almost nothing has changed with that functionality for many years.
> GIL splitting,
The work related to a per-interpreter GIL is almost entirely things that are getting done anyway for other reasons. There are only a few things (like the GIL) that we would not have made per-interpreter anyway.
> etc. is stupendously complex,
The project involves lots of little pieces, each supremely tractable. So if by "stupendously complex" you mean "stupendously tedious/boring" then I agree. :) It isn't something that requires a big brain so much as a willingness to stick with it.
> and I feel like this means that each piece has gotten way less scrutiny than it would have if it had to stand on its own.
I apologize if I haven't communicated the nature of this project clearly. There really aren't any significant architectural changes involved. Rather, we're moving a lot of little things around in a uniform way. Probably the biggest thing is updating modules for PEPs 3121 and 489, and the several other PEPs (like 573) that have come about as part of that. As I've said, most of the things I need done for my project are things others need done for other projects, which is a happy situation for all of us. :)

In fact, I'm sure I have actually done relatively little of the work (in case that was a source of concern for you <wink> -- most of the code I've touched is new as part of the PEP 554 implementation). So most of this stuff already does "stand on its own" and has gotten just as much scrutiny as any other work we do all the time in CPython (e.g. PR reviews, BPO discussion).
> This seems like a bad outcome, since you'd think that if something is more complex, it should get *more* scrutiny, not less. But it's very hard to effectively reason about the whole conglomeration and formulate feedback.
Do you still have concerns after my explanation above?

-eric