
Antoine Pitrou <solipsis@pitrou.net> wrote:
On Wed, 9 May 2012 11:26:29 +0200 Stefan Krah <stefan@bytereef.org> wrote:
Antoine Pitrou <solipsis@pitrou.net> wrote:
_decimal is about 12% faster without threads, because the expensive thread local context can be disabled.
If you cached the last thread id along with the corresponding context, perhaps it could speed things up in most scenarios?
Nice. This reduces the speed difference to about 4%!
Note that you don't need the actual thread id; the Python thread state is sufficient: PyThreadState_GET should be a simple variable lookup in release builds.
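For illustration only, here is a minimal Python sketch of the caching idea discussed above; it is not the actual _decimal code, which is written in C. The names Context, _contexts, _last and current_context are hypothetical, and threading.get_ident() stands in for the thread state pointer. The sketch also assumes thread ids are not recycled while a cached entry is still in use.

```python
import threading


class Context:
    """Hypothetical stand-in for a per-thread decimal context."""

    def __init__(self, owner):
        self.owner = owner


_contexts = {}                    # slow path: thread id -> Context
_contexts_lock = threading.Lock()
_last = (None, None)              # one-entry cache: (thread id, Context)


def current_context():
    """Return the calling thread's context, checking the cache first."""
    global _last
    tid = threading.get_ident()
    cached = _last                # read the pair once so it stays consistent
    if cached[0] == tid:
        return cached[1]          # fast path: same thread as the last caller
    with _contexts_lock:          # slow path: per-thread table lookup
        ctx = _contexts.get(tid)
        if ctx is None:
            ctx = _contexts[tid] = Context(tid)
    _last = (tid, ctx)            # remember the most recent caller
    return ctx
```

As long as one thread makes most of the calls, nearly every lookup takes the fast path, which is presumably where the speedup comes from.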
I've tried both ways now and the speed gain is roughly the same. Perhaps the interpreter as a whole is slightly faster --without-threads? That would explain the remaining speed difference of 4%.

Stefan Krah