[Python-Dev] Re: PEP 554 for 3.9 or 3.10?

April 29, 2020

On Tue, Apr 21, 2020 at 10:24 AM Sebastian Berg
<sebastian@sipsolutions.net> wrote:
...
On Tue, 2020-04-21 at 16:21 +0200, Victor Stinner wrote:
...
I fail to follow your logic. When the asyncio PEP was approved, I
don't recall that suddenly the whole Python community started to
rewrite all projects to use coroutines everywhere. I tried hard to
replace eventlet with asyncio in OpenStack and I failed because such
migration was a very large project with dubious benefits (people
impacted by eventlet issues were the minority).
Sure, but this is very different. You can still use NumPy in a project
using asyncio. You are _not_ able to use NumPy in a project using
subinterpreters.
True.  Is that a short-term problem?  I don't know.  A long-term
problem?  Definitely.  So it will have to be addressed at some point.

The biggest concern here is what is the resulting burden on extension
authors and what can we do to help mitigate that.  The first step is
to understand what that burden might entail.
...
Right now, I have to say as soon as the first bug report asking for
this is opened and tells me: But see PEP 554 you should support it! I
would be tempted to put on the NumPy Roadmap/Vision that no current
core dev will put serious efforts into subinterpreters. Someone is
bound to be mad.
Yeah.  And I don't want to put folks in the position that they get
fussed at for something like this.  This isn't a ploy to force
projects like numpy to fix their subinterpreter support.

My (honest) question is, how many folks using subinterpreters are
going to want to use numpy (or module X) enough to get mad about it
before the extension supports subinterpreters?  What will user
expectations be when it comes to subinterpreters?

We will make the docs as clear as we can, but there are plenty of
users out there that will not pay enough attention to know that most
extension modules will not support subinterpreters at first.  Is there
anything we can do to mitigate this impact?  How much would it help if
the ImportError for incompatible modules gave a clear (though
lengthier) explanation of the situation?
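
For illustration only, a lengthier message might read something like the
sketch below.  The helper name, the wording, and the module name are all
made up for this example; nothing like this exists in CPython as written.

```python
# Hypothetical sketch: neither this helper nor this wording exists in CPython.
def subinterpreter_import_error(module_name):
    """Build an ImportError explaining why the import was refused."""
    message = (
        f"module {module_name!r} does not support loading in subinterpreters. "
        "Many extension modules do not support subinterpreters yet, "
        "so this module can only be imported in the main interpreter for now. "
        "See the module's own documentation for its plans, if any."
    )
    # ImportError accepts the offending module's name as a keyword argument.
    return ImportError(message, name=module_name)


# "some_extension" is a placeholder, not a real package.
err = subinterpreter_import_error("some_extension")
```

A user who hits such an error would at least learn that the limitation is
expected, rather than assuming the extension is broken.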
...
Basically, if someone wants it in NumPy, I personally may expect them
to be prepared to invest a year worth of good dev time [1]. Maybe that
is pessimistic, but your guess is as good as mine. At normal dev-pace
it will be at least a few years of incremental changes before NumPy
might be ready (how long did it take Python?).
The PEP links to NumPy bugs, I am not sure that we ever fixed a single
one. Even if, the remaining ones are much larger and deeper. As of now,
the NumPy public API has to be changed to even start supporting
subinterpreters as far as I am aware [2]. This is because right now we
sometimes need to grab the GIL (raise errors) in functions that are not
passed GIL state.
What do you expect to have to change?  It might not be as bad as you
think...or I suppose it could be. :)

Keep in mind that subinterpreter support means making sure all of the
module's global state is per-interpreter.  I'm hearing about things
like passing around GIL state and using the limited C-API.  None of
that should be a factor.
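
As a rough picture of what "per-interpreter" means here, the toy model
below contrasts one process-wide value (the analogue of a C static
variable) with state that is looked up through the interpreter.  It is
pure Python and does not use the real C-API; every name in it is
hypothetical.

```python
# Toy model only: these names are made up and do not use the real C-API.

class ToyInterpreter:
    """Stands in for one (sub)interpreter; owns its own module state."""

    def __init__(self, name):
        self.name = name
        self.module_state = {}  # module name -> that module's state dict


# Analogue of a C `static` variable: one copy for the whole process.
_PROCESS_WIDE_STATE = {"calls": 0}


def bump_shared(interp):
    """Problematic pattern: the interpreter is ignored, so all callers
    mutate the same object."""
    _PROCESS_WIDE_STATE["calls"] += 1
    return _PROCESS_WIDE_STATE["calls"]


def bump_isolated(interp, module_name="toyext"):
    """Per-interpreter pattern: state is reached through the interpreter."""
    state = interp.module_state.setdefault(module_name, {"calls": 0})
    state["calls"] += 1
    return state["calls"]


main, sub = ToyInterpreter("main"), ToyInterpreter("sub")

# With shared state, the call made from "sub" sees the change from "main".
shared_results = (bump_shared(main), bump_shared(sub))

# With per-interpreter state, each interpreter starts from its own zero.
isolated_results = (bump_isolated(main), bump_isolated(sub))
```

In a real extension the second pattern would live in C, with the state
hanging off the module object rather than a Python dict.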
...
This all is not to say that this PEP itself doesn't seem harmless. But
the _expectation_ that subinterpreters should be first class citizens
will be a real and severe transition burden. And if it does not, the
current text of the PEP gives me, as someone naive about
subinterpreters, very few reasons why I should put in that effort or
reasons to make me believe that it actually is not as bad a transition
as it seems.
Yeah, the PEP is very light on useful information for extension module
maintainers.  What information do you think would be most helpful?
...
Right now, I would simply refuse to spend time on it. But as Nathaniel
said, it may be worse if I did not refuse and in the end only a handful
of users get anything out of my work: The time is much better spent
elsewhere. And you, i.e. CPython will spend your "please fix your C-
extension" chips on subinterpreters. Maybe that is the only thing on
the agenda, but if it is not, it could push other things away.
Good point.
...
Reading the PEP, it is fuzzy on the promises (the most concrete I
remember is that it may be good for security relevant reasons), which
is fine, because the goal is "experimentation" more than use?
The PEP is definitely lacking clear details on how folks might use
subinterpreters (via the proposed module).  There are a variety of
reasons.

I originally wrote the PEP mostly as "let's expose existing
functionality more broadly", with the goal of getting it into folks'
hands sooner rather than later.  My focus was mostly on the API.  I
didn't see a strong need to convince anyone that the feature itself
was worth it (since it already existed).  In many ways the PEP is a
side effect of my efforts to achieve a good multi-core Python story
(via a per-interpreter GIL).  All the relevant parties in that effort
saw PEP 554 as worth it for that, so there wasn't much pressure to
elaborate.  At the same time, in the PEP I tried to avoid the GIL
connection because I personally think subinterpreters offer a
meaningful value already, even while they share the GIL, and that the
PEP should stand on its own merits.  I just didn't develop that aspect
very far.  On top of that, ultimately a PEP is a vehicle to aid a
BDFL-delegate in making a decision, so I usually write PEPs with a
specific audience.  Put all that together, along with the limited time
I have, and you get a lackluster explanation of the benefits from
subinterpreters.

At this point I'm torn on spending more time on fleshing that out.  I
want folks to get inspired, but at the same time I have to ration my
open-source time carefully.
...
So if it's more about "experimentation", then I have to ask, whether:
1. The PEP can state that more obviously, it wants to be
provisionally/experimentally accepted? So maybe it should even say that
extension modules are not (really) encouraged to transition unless
they feel a significant portion of their users will gain.
That sounds fair.  Then extension authors can point concerned users to
the docs.  The docs would say something like "Support for
subinterpreters in extension modules should not be expected while this
module is provisional."  That isn't the only thing we should do to set
user expectations properly, but it would help.

(FWIW, for extension maintainers we would likely also have a link on
the "interpreters" module docs pointing to the
how-to-support-subinterpreters-in-an-extension-module page.)
...
2. The point about developing it outside of the Python standard lib
should be considered more seriously. I do not know if that can be done,
but C-API additions/changes/tweaks seem a bit orthogonal to the python
exposure? So maybe it actually is possible?
It is possible.  The low-level implementation is an extension module
that uses the public C-API exclusively.  Using the internal C-API
would be a little dishonest.  That said, the module is very closely
tied to the CPython runtime, so it makes sense to keep it in the
CPython repo.  Furthermore, I don't see much value in keeping it out
of the stdlib.  Finally, a few of us have plans for using
subinterpreters in CPython's test suite.  I suppose we could keep the
module's code and tests in the CPython repo, while releasing it
separately, but I don't see the point.
...
As far as I can tell, nobody can or _should_ expect subinterpreters to
actually run most general python code for many years.
Subinterpreters run all Python code right now.  I'm guessing by
"general python code" you are talking about the code folks are writing
plus their dependencies.  In that case, it's only with extension
modules that we run into a problem, and we still don't know with how
many of those it's a problem where it will take a lot of work.
However, I *am* convinced that there is a non-trivial amount of work
there and that it impacts large extension modules more than others.
The question is, what can we do to mitigate the amount of work there?
...
Yes, it's a
chicken-and-egg problem, unless users start to use subinterpreters
successfully, C-extensions should probably not even worry to
transition.
This PEP wants to break the chicken-and-egg problem to have a start,
but as of now, as far as I can tell, it *must not* promise that it will
ever work out.
I see where you're coming from but think it's highly unlikely at this
point in CPython's life that subinterpreters (as a public feature)
will ever go away.  I'm definitely biased :) but it would be hard to
justify removing a feature that has been publicly available for most
of Python's existence.  We know of a small number of public projects
that already use subinterpreters through the C-API and expect there is
a not insignificant number of private projects that rely on the
feature.  (That doesn't even consider the projects that tried to use
subinterpreters but couldn't move past the limitations (which are
mostly due to lack of use).)
...
So, I cannot judge the sentiment on subinterpreters. But it may be good
to make it *painfully* clear what you expect from a project like NumPy
in the next few years. Alternatively, make it painfully clear that you
possibly even discourage us from spending time on it now, if it's not
straightforward. Those using this module are on their own for many
years, probably even after success is proven.
Thanks for the feedback!
...
[1] As of now, the way I see it is that I could not even make NumPy
(and probably many C extensions) work, because I doubt that the limited
API has been exercised enough [2] and I am pretty sure it has holes.
Also the PEP about passing module state around to store globals
efficiently seems necessary, and is not in yet? (Again, trust: I have
to trust you that e.g. what you do to make argument parsing not have
overhead in argument clinic will be something that I can use for
similar purposes within NumPy)
The relevant API is on its way.  We'd be glad to pursue C-API
improvements to facilitate efficient support for subinterpreters in
extension modules.  FWIW, often such improvements provide other
benefits that are desirable on their own.
...
[2]  I hope that we will do (many of) these changes for other reasons
within a year or so,
That's good to know.  What specific changes are you planning for?  Are
there documents or issue #s you could point us at?  This sort of thing
may be helpful to identify how we can assist from the CPython side.
...
but they go deep into code barely touched in a
decade. Realistically, even after the straightforward changes (such as
using the new PEPs for module initialization), these may take up an
additional few months of dev time (sure, get someone who is very good or
does nothing else, and they can maybe do it much quicker).
So yes, from the perspective of a complex C-extension, this is probably
very comparable to the 2to3 change (it happened largely before my time
though).
Also good to know.  I'm hopeful that nothing will ever be as
disruptive as the 2/3 transition. :)  I can see what you mean on a
project-by-project basis though.
...
[3] E.g. I think I want an ExtensionMetaClass, a bit similar to an ABC,
but I would prefer to store the data in a true C-slot fashion. The
limited API cannot do MetaClasses correctly as far as I could tell and
IIRC is likely even a bit buggy.
Are ExtensionMetaClasses crazy? Maybe, but PySide does it too (and as
far as I can tell, they basically get away with it by a bit of hacking
and relying on Python implementation details).
That's an interesting idea.  How much would that help with
subinterpreter support?  Regardless, you should consider bringing this
up separately on the capi-sig mailing list.

Thanks again!

-eric

Eric Snow