On Fri, Apr 17, 2020 at 11:50 AM Eric Snow firstname.lastname@example.org wrote:
> Many folks have conflated PEP 554 with having a per-interpreter GIL. In fact, I was careful to avoid any mention of parallelism or the GIL in the PEP. Nonetheless some are expecting that when PEP 554 lands we will reach multi-core nirvana.
>
> While PEP 554 might be accepted and the implementation ready in time for 3.9, the separate effort toward a per-interpreter GIL is unlikely to be sufficiently done in time. That will likely happen in the next couple months (for 3.10).
>
> So...would it be sufficiently problematic for users if we land PEP 554 in 3.9 without per-interpreter GIL?
>
> Here are the options as I see them (if the PEP is accepted in time for 3.9):
>
> 1. merge PEP 554 into 3.9 even if per-interpreter GIL doesn't get into 3.9 (they get parallelism for free in 3.10)
> 2. like 1, but mark the module as provisional until per-interpreter GIL lands
> 3. do not merge PEP 554 until per-interpreter GIL is merged
> 4. like 3, but publish a 3.9-only module to PyPI in the meantime
I think some perspective might be useful here :-).
The last time we merged a new concurrency model in the stdlib, it was asyncio.
In that case, the process went something like:
- We started with two extremely mature libraries (Twisted + Tornado) with long histories of real-world use
- The asyncio designers (esp. Guido) did a very extensive analysis of these libraries' design choices, spoke to the maintainers about what they'd learned from hard experience, etc.
- Asyncio was initially shipped outside the stdlib to allow for testing and experimentation, and at this stage it was used to build non-trivial projects (e.g. the aiohttp project's first commits use tulip, not asyncio)
- When it was eventually added to the stdlib, it was still marked provisional for multiple Python releases, and underwent substantial and disruptive changes during this time
- Even today, the limitations imposed by the stdlib release cycle still add substantial difficulty to maintaining asyncio
OTOH, AFAICT the new concurrency model in PEP 554 has never actually been used, and it isn't even clear whether it's useful at all. Designing useful concurrency models is *stupidly* hard. And on top of that, it requires major reworks of the interpreter internals + disrupts the existing C extension module ecosystem -- which is very different from asyncio, where folks who didn't use it could just ignore it.
So to me, it's kind of shocking that you'd even bring up the possibility of merging PEP 554 as-is, without even a provisional marker. And if it's possible for it to live on PyPI, then why would we even consider putting it into the stdlib? Personally, I'm still leaning towards thinking that the whole subinterpreter project is fundamentally flawed, and that on net we'd be better off removing support for them entirely. But that's a more complex and nuanced question that I'm not 100% certain of, while the idea of merging it for 3.9 seems like a glaringly obvious bad idea.
I know you want folks to consider PEP 554 on its own merits, ignoring the GIL-splitting work, but let's be realistic: purely as a concurrency framework, there are at least a dozen more mature/featureful/compelling options in the stdlib and on PyPI, and as an isolation mechanism, subinterpreters have been around for >20 years and in that time they've found 3 users and no previous champions. Obviously the GIL stuff is the only reason PEP 554 might be worth accepting. Or if PEP 554 is really a good idea on its own merits, purely as a new concurrency API, then why not build that concurrency API on top of multiprocessing and put it on PyPI and let real users try it out?
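To make that last suggestion concrete, here's a rough sketch of what such a prototype could look like. All the names here (`Interpreter`, `run`) are hypothetical, not PEP 554's actual API, and for brevity it gets its isolation from a child process via subprocess -- the crude form of what a real multiprocessing-based package would provide:

```python
import subprocess
import sys

class Interpreter:
    """Hypothetical stand-in for a PEP-554-style interpreter object.

    Runs a source string in a completely separate Python interpreter
    (here: a child process), which gives the same basic isolation
    property that subinterpreters promise.
    """

    def run(self, source):
        # Launch a fresh interpreter, execute the source, and capture
        # whatever it prints as the "result" of the run.
        proc = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True, text=True, check=True,
        )
        return proc.stdout

interp = Interpreter()
out = interp.run("print(sum(range(10)))")
print(out.strip())  # prints 45
```

A package like this could grow channels, sharing primitives, etc., and real users could kick the tires on the *concurrency model* without any changes to the interpreter internals.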
One more thought. Quoting from Poul-Henning Kamp's famous email at bikeshed.org:
    Parkinson shows how you can go in to the board of directors and get approval for building a multi-million or even billion dollar atomic power plant, but if you want to build a bike shed you will be tangled up in endless discussions.

    Parkinson explains that this is because an atomic plant is so vast, so expensive and so complicated that people cannot grasp it, and rather than try, they fall back on the assumption that somebody else checked all the details before it got this far. Richard P. Feynmann gives a couple of interesting, and very much to the point, examples relating to Los Alamos in his books.

    A bike shed on the other hand. Anyone can build one of those over a weekend, and still have time to watch the game on TV. So no matter how well prepared, no matter how reasonable you are with your proposal, somebody will seize the chance to show that he is doing his job, that he is paying attention, that he is *here*.
Normally, when people reference this story they focus on the bikeshed, hence the term "bikeshedding". But for PEP 554, you're building a nuclear power plant :-). The whole conglomeration of a new concurrency API, new subinterpreter support in the interpreter, GIL splitting, etc. is stupendously complex, and I feel like this means that each piece has gotten way less scrutiny than it would have if it had to stand on its own. This seems like a bad outcome, since you'd think that if something is more complex, it should get *more* scrutiny, not less. But it's very hard to effectively reason about the whole conglomeration and formulate feedback.