[Python-Dev] Re: PEP 554 for 3.9 or 3.10?

April 17, 2020

      On Fri, Apr 17, 2020 at 11:50 AM Eric Snow <ericsnowcurrently@gmail.com> wrote:
...
Dilemma
============

Many folks have conflated PEP 554 with having a per-interpreter GIL.
In fact, I was careful to avoid any mention of parallelism or the GIL
in the PEP.  Nonetheless some are expecting that when PEP 554 lands we
will reach multi-core nirvana.

While PEP 554 might be accepted and the implementation ready in time
for 3.9, the separate effort toward a per-interpreter GIL is unlikely
to be sufficiently done in time.  That will likely happen in the next
couple months (for 3.10).

So...would it be sufficiently problematic for users if we land PEP 554
in 3.9 without per-interpreter GIL?

Options
============

Here are the options as I see them (if the PEP is accepted in time for 3.9):

1. merge PEP 554 into 3.9 even if per-interpreter GIL doesn't get into
   3.9 (they get parallelism for free in 3.10)
2. like 1, but mark the module as provisional until per-interpreter GIL lands
3. do not merge PEP 554 until per-interpreter GIL is merged
4. like 3, but publish a 3.9-only module to PyPI in the meantime

I think some perspective might be useful here :-).

The last time we merged a new concurrency model in the stdlib, it was asyncio.

In that case, the process went something like:

- We started with two extremely mature libraries (Twisted + Tornado)
with long histories of real-world use
- The asyncio designers (esp. Guido) did a very extensive analysis of
these libraries' design choices, spoke to the maintainers about what
they'd learned from hard experience, etc.
- Asyncio was initially shipped outside the stdlib to allow for
testing and experimentation, and at this stage it was used to build
non-trivial projects (e.g. the aiohttp project's first commits use
tulip, not asyncio)
- When it was eventually added to the stdlib, it was still marked
provisional for multiple Python releases, and underwent substantial
and disruptive changes during this time
- Even today, the limitations imposed by the stdlib release cycle
still add substantial difficulty to maintaining asyncio

OTOH, AFAICT the new concurrency model in PEP 554 has never actually
been used, and it isn't even clear whether it's useful at all.
Designing useful concurrency models is *stupidly* hard. And on top of
that, it requires major reworks of the interpreter internals +
disrupts the existing C extension module ecosystem -- which is very
different from asyncio, where folks who didn't use it could just
ignore it.

So to me, it's kind of shocking that you'd even bring up the
possibility of merging PEP 554 as-is, without even a provisional
marker. And if it's possible for it to live on PyPI, then why would we
even consider putting it into the stdlib? Personally, I'm still
leaning towards thinking that the whole subinterpreter project is
fundamentally flawed, and that on net we'd be better off removing
support for them entirely. But that's a more complex and nuanced
question that I'm not 100% certain of, while the idea of merging it
for 3.9 seems like a glaringly obvious bad idea.

I know you want folks to consider PEP 554 on its own merits, ignoring
the GIL-splitting work, but let's be realistic: purely as a
concurrency framework, there are at least a dozen more
mature/featureful/compelling options in the stdlib and on PyPI, and as
an isolation mechanism, subinterpreters have been around for >20 years
and in that time they've found 3 users and no previous champions.
Obviously the GIL stuff is the only reason PEP 554 might be worth
accepting. Or if PEP 554 is really a good idea on its own merits,
purely as a new concurrency API, then why not build that concurrency
API on top of multiprocessing and put it on PyPI and let real users
try it out?

One more thought. Quoting from Poul-Henning Kamp's famous email at bikeshed.org:
...
Parkinson shows how you can go in to the board of directors and
get approval for building a multi-million or even billion dollar
atomic power plant, but if you want to build a bike shed you will
be tangled up in endless discussions.

Parkinson explains that this is because an atomic plant is so vast,
so expensive and so complicated that people cannot grasp it, and
rather than try, they fall back on the assumption that somebody
else checked all the details before it got this far.   Richard P.
Feynmann gives a couple of interesting, and very much to the point,
examples relating to Los Alamos in his books.

A bike shed on the other hand.  Anyone can build one of those over
a weekend, and still have time to watch the game on TV.  So no
matter how well prepared, no matter how reasonable you are with
your proposal, somebody will seize the chance to show that he is
doing his job, that he is paying attention, that he is *here*.

Normally, when people reference this story they focus on the bikeshed,
hence the term "bikeshedding". But for PEP 554, you're building a
nuclear power plant :-). The whole conglomeration of a new concurrency
API, new subinterpreter support in the interpreter, GIL splitting,
etc. is stupendously complex, and I feel like this means that each
piece has gotten way less scrutiny than it would have if it had to
stand on its own. This seems like a bad outcome, since you'd think
that if something is more complex, it should get *more* scrutiny, not
less. But it's very hard to effectively reason about the whole
conglomeration and formulate feedback.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
