> On Wed, Nov 13, 2019 at 14:28, Larry Hastings <larry@hastings.org> wrote:
>> I did exactly that in the Gilectomy prototype. Pulling it out of TLS was too slow,
>
> What do you mean? Was getting tstate from TLS a performance bottleneck by itself? Reading a TLS variable seems to be quite efficient.

I'm pretty sure you understand the sentence "Pulling it out of
TLS was too slow". At the time CPython used the POSIX APIs for
accessing thread local storage, and I didn't know about and
therefore did not try this "__thread" GCC extension. I do
remember trying some other API that was purported to be
faster--maybe a GCC library function for faster TLS access?--but I
didn't get that to work either before I gave up on it out of
frustration.
Also, I dimly recall that I moved several things from globals into the ThreadState structure, and probably added one or two of my own. So nearly every function call was referencing ThreadState at one point or another. Passing it as a parameter was a definite win over calling the POSIX TLS APIs.
I also took the opportunity to pass my "reference count manager" data as a separate parameter, which again was per-thread and again was a major win at the time.
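
For readers who want to see the three access patterns side by side, here is a minimal sketch in C. It is not Gilectomy or CPython code: `thread_state`, `call_count`, and the `work_*` and `run_all_variants` names are hypothetical stand-ins, and the sketch shows only how each approach reaches the per-thread state, not how fast each one is.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a per-thread state struct; NOT the real PyThreadState. */
typedef struct {
    long call_count;
} thread_state;

static pthread_key_t tstate_key;             /* slot for the POSIX TLS API */
static __thread thread_state *tstate_local;  /* GCC/Clang "__thread" extension */

/* Variant 1: look the state up through the POSIX TLS API on every call. */
static void work_posix_tls(void)
{
    thread_state *ts = pthread_getspecific(tstate_key);
    ts->call_count++;
}

/* Variant 2: read a __thread variable directly. */
static void work_thread_local(void)
{
    tstate_local->call_count++;
}

/* Variant 3: the caller passes the state down as an ordinary parameter. */
static void work_param(thread_state *ts)
{
    ts->call_count++;
}

/* Runs each variant once on the calling thread and returns the final
 * counter, or -1 if the POSIX TLS key could not be set up. */
long run_all_variants(void)
{
    thread_state ts = {0};

    if (pthread_key_create(&tstate_key, NULL) != 0)
        return -1;
    if (pthread_setspecific(tstate_key, &ts) != 0) {
        pthread_key_delete(tstate_key);
        return -1;
    }
    tstate_local = &ts;

    work_posix_tls();
    work_thread_local();
    work_param(&ts);

    /* All three variants updated the same struct. */
    assert(ts.call_count == 3);

    /* Clear the thread-local pointers before ts goes out of scope. */
    tstate_local = NULL;
    pthread_setspecific(tstate_key, NULL);
    pthread_key_delete(tstate_key);
    return ts.call_count;
}
```

In the parameter-passing variant every function on the call path has to accept and forward the extra argument, which is the trade-off described above; a second per-thread structure (such as the reference count manager data) would simply be one more parameter of the same kind.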
/arry