data:image/s3,"s3://crabby-images/90304/903046437766f403ab1dffb9b01cc036820247a4" alt=""
On 16/03/2020 3:04 pm, Victor Stinner wrote:
Hi,
Changes on this scale merit a PEP and proper discussion, rather than being added piecemeal without proper review.
Last November, I asked explicitly on python-dev if we should "Pass the Python thread state to internal C functions": https://mail.python.org/archives/list/python-dev@python.org/thread/PQBGECVGV...
In short, the answer is yes.
I said "no" then and gave reasons. AFAICT no one has faulted my reasoning. Let me reiterate why using a thread-local variable is better than passing the thread state down the C stack. 1. Using a thread-local variable for the thread state requires much smaller changes to the code base. 2. Using a thread-local variable is less error prone. When passing tstate as a parameter, what happens if the tstate argument is from a different thread or is NULL? Are you adding checks for those cases? What are the performance implications of adding those checks? 3. Using a thread-local variable is likely to be a little bit faster. Passing an argument down the stack increases register pressure and spills. Accessing a thread-local is slower at the point of access, but the cost is incurred only when it is needed, so is cheaper overall.
There is no PEP but scatted documents. I wrote a short article to elaborate the context of this work: https://vstinner.github.io/cpython-pass-tstate.html
One motivation is to ease the implementation of subinterpreters (PEP 554). But PEP 554 describes more than public API than the implementation.
I don't see how this eases the implementation of subinterpreters. Surely it makes it harder by causing merge conflicts.
--
In the meanwhile, I modified "small integer singletons" to make them "per-interpreter". So tstate->interp is now used to get small integers in longobject.c.
I also opened a discussion on other singletons (None, True, False, ...): https://bugs.python.org/issue39511
That's great, but I don't see the relevance.
The long-term goal is to be able to run multiple isolated interpreters in parallel.
An admirable goal, IMO. But how is it to be done? That is the question.
Le lun. 16 mars 2020 à 15:16, Mark Shannon <mark@hotpy.org> a écrit :
There seems to be a proliferation of `PyThreadState *tstate` arguments being added to API and internal functions.
So far, tstate should only be passed to *internal* C functions. I don't think that the public C API has been modified to pass tstate.
These changes are listed under https://bugs.python.org/issue38644.
There was also https://bugs.python.org/issue36710
I think that these changes are misguided. The desired results can be achieved more reliably and more simply in other ways.
Would you mind to elaborate?
Using thread-local variables changes the API less, is more reliable and maybe faster.
The changes add bulk to the C-API and may hurt performance.
Did you notice that in benchmarks? I would be curious to see the overhead.
These changes are also causing a lot of churn and merge conflicts (for me at least).
Sorry about that :-/ A lot of Python internals should be modified to implement subinterpreters.
I don't think they *should*. When they *must*, then do that. Changes should only be made if necessary for correctness or for a significant improvement of performance. Cheers, Mark.
Victor