
On Thu, Nov 14, 2019 at 04:55, Larry Hastings <larry@hastings.org> wrote:
I'm pretty sure you understand the sentence "Pulling it out of TLS was too slow". At the time CPython used the POSIX APIs for accessing thread local storage, and I didn't know about and therefore did not try this "__thread" GCC extension. I do remember trying some other API that was purported to be faster--maybe a GCC library function for faster TLS access?--but I didn't get that to work either before I gave up on it out of frustration.
I asked for confirmation, since I was surprised. But when I looked at the assembly with my friend, we played with __thread, not with pthread_getspecific(). So thanks for confirming that "getting tstate" can be a performance bottleneck: that's a very good reason to pass it explicitly.
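To make the three options concrete, here is a rough sketch, not CPython's actual code. ThreadState is a hypothetical stand-in for PyThreadState, and all the function names are invented for illustration. It assumes a POSIX system and a compiler that supports the "__thread" extension (GCC or Clang).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for PyThreadState; not the real CPython struct. */
typedef struct {
    int recursion_depth;
} ThreadState;

/* Option 1: POSIX TLS. Every lookup goes through a library call. */
static pthread_key_t tstate_key;
static pthread_once_t tstate_key_once = PTHREAD_ONCE_INIT;

static void make_tstate_key(void)
{
    int err = pthread_key_create(&tstate_key, NULL);
    assert(err == 0);
    (void)err;  /* silence the unused warning when NDEBUG is defined */
}

/* Option 2: the "__thread" extension. The compiler emits the TLS access
   inline, typically a short segment-relative load. */
static __thread ThreadState *fast_tstate;

/* Register the calling thread's state with both TLS mechanisms. */
static void set_current_tstate(ThreadState *tstate)
{
    pthread_once(&tstate_key_once, make_tstate_key);
    pthread_setspecific(tstate_key, tstate);
    fast_tstate = tstate;
}

static ThreadState *get_tstate_posix(void)
{
    return (ThreadState *)pthread_getspecific(tstate_key);
}

static ThreadState *get_tstate_fast(void)
{
    return fast_tstate;
}

/* Implicit style: the callee has to fetch tstate on every call. */
static int enter_call_implicit(void)
{
    ThreadState *tstate = get_tstate_posix();
    return ++tstate->recursion_depth;
}

/* Option 3, explicit style: the caller already has tstate and passes it
   down, so the callee does no TLS lookup at all. */
static int enter_call_explicit(ThreadState *tstate)
{
    return ++tstate->recursion_depth;
}
```

With the explicit style, a thread looks up its state once near the entry point and then calls enter_call_explicit(tstate) all the way down. How much that saves depends on the platform and on how hot the code path is, so it should be measured rather than assumed.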
I also took the opportunity to pass my "reference count manager" data as a separate parameter, which again was per-thread and again was a major win at the time.
Another approach would be to pass a "PyContext*" pointer which contains tstate, but also additional fields. But I chose to start with a direct "PyThreadState* tstate" to avoid one extra indirection on every tstate access. For now, tstate seems to be enough for the current code base.

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.