> Remember that pystone is a terrible benchmark.

I understand that. I was only using it as a spot check. I was surprised at how much slower my (threaded or unthreaded) matrix multiply was on nogil vs 3.9+. I went into it thinking I would see an improvement. The Performance section of Sam's design document starts:

"As mentioned above, the no-GIL proof-of-concept interpreter is about 10% faster than CPython 3.9 (and 3.10) on the pyperformance benchmark suite."

so it didn't occur to me that I'd be looking at a slowdown, much less by as much as I'm seeing.

Maybe I've somehow stumbled on some instruction mix for which the nogil VM is much worse than the stock VM. For now, I prefer to think I'm just doing something stupid. It certainly wouldn't be the first time.
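
For concreteness, a spot check of this kind might look roughly like the sketch below. It is only an illustration, not the script used for the measurements in this thread: the names matmul, matmul_threaded, and timed, the matrix size, and the thread count are all hypothetical. Running the same file under each interpreter would give comparable wall-clock numbers.

```python
import threading
import time


def matmul(a, b):
    """Multiply two matrices given as lists of rows (pure Python, no NumPy)."""
    cols = list(zip(*b))  # columns of b, computed once
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def matmul_threaded(a, b, nthreads=4):
    """Same product, with the rows of `a` divided among `nthreads` threads."""
    cols = list(zip(*b))
    result = [None] * len(a)

    def work(start):
        # Each thread handles rows start, start + nthreads, start + 2*nthreads, ...
        for i in range(start, len(a), nthreads):
            result[i] = [sum(x * y for x, y in zip(a[i], col)) for col in cols]

    threads = [threading.Thread(target=work, args=(s,)) for s in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    t0 = time.perf_counter()
    out = fn(*args)
    return out, time.perf_counter() - t0


if __name__ == "__main__":
    n = 100  # hypothetical size; small enough to run quickly
    a = [[(i + j) % 7 for j in range(n)] for i in range(n)]
    b = [[(i * j) % 5 for j in range(n)] for i in range(n)]
    _, t_plain = timed(matmul, a, b)
    _, t_threads = timed(matmul_threaded, a, b)
    print(f"unthreaded: {t_plain:.3f}s  threaded: {t_threads:.3f}s")
```

With the GIL in place the threaded version is not expected to beat the unthreaded one, since only one thread runs bytecode at a time. The interesting comparison is the same file run under both interpreters.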


P.S. I suppose I should have cc'd Sam when I first replied to this thread, but I'm doing so now. I figured my mistake would reveal itself early on. Sam, here's my first post about my little "project." https://mail.python.org/archives/list/python-dev@python.org/message/WBLU6PZ2RDPEMG3ZYBWSAXUGXCJNFG4A/