
On AMD64 Linux, the base address of the thread-local data appears to be held in the FS segment register (GS plays that role on 32-bit x86) [1]. It seems likely that other platforms and operating systems do something similar.

Passing threadstate as an explicit argument could be either faster or slower, depending on how often it is used. If you use threadstate often, passing it explicitly (which will likely occupy a CPU register) could be a win. If you use it rarely, that register would be better spent on function arguments you actually use.

Running some experiments with optimized (i.e. platform-specific) TLS would seem a useful step before undertaking a major refactoring. Explicit passing could be a lot of code churn for no practical gain.

[1] https://stackoverflow.com/questions/6611346/how-are-the-fs-gs-registers-used...
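
To make the trade-off concrete, here is a minimal sketch of the two styles in C. The names (ThreadState, bump_tls, bump_explicit, current_state) are hypothetical and are not CPython's actual API. It assumes GCC or Clang, where `__thread` is a compiler extension (C11 spells it `_Thread_local`). On x86-64 Linux an access to such a variable typically compiles to a segment-relative load, though code built as a shared library may go through a helper call instead.

```c
/* Hypothetical stand-in for an interpreter's per-thread state. */
typedef struct {
    long counter;
} ThreadState;

/* Platform-specific TLS: one zero-initialized instance per thread.
   `__thread` is a GCC/Clang extension (assumption: such a compiler). */
static __thread ThreadState tls_state;

/* Implicit style: every call locates the state through TLS. */
long bump_tls(void)
{
    return ++tls_state.counter;
}

/* Explicit style: the caller supplies the state, which takes up
   an argument register on every call in the chain. */
long bump_explicit(ThreadState *ts)
{
    return ++ts->counter;
}

/* Bridge between the two styles: fetch the TLS pointer once,
   then pass it explicitly to hot code paths. */
ThreadState *current_state(void)
{
    return &tls_state;
}
```

A benchmark could call bump_tls() and bump_explicit(current_state()) in a tight loop and compare timings. Which one wins will likely depend on the compiler, the TLS model in use, and how deep the call chain is, so the numbers from one platform should not be assumed to carry over to another.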