I think the performance difference is because of different versions of NumPy.
Python 3.9 installs NumPy 1.21.3 by default for "pip install numpy". I've only built and packaged NumPy 1.19.4 for "nogil" Python. There are substantial performance differences between the two NumPy builds for this matmul script.

With NumPy 1.19.4, I get practically the same results for both Python 3.9.2 and "nogil" Python for "time python3 matmul.py 0 100000".

I'll update the version of NumPy for "nogil" Python if I have some time this week.

Best,
Sam

On Sun, Oct 31, 2021 at 5:46 PM Skip Montanaro <email@example.com> wrote:
> > Remember that pystone is a terrible benchmark.
>
> I understand that. I was only using it as a spot check. I was surprised at how much slower my (threaded or unthreaded) matrix multiply was on nogil vs 3.9+. I went into it thinking I would see an improvement. The Performance section of Sam's design document starts:
>
>     As mentioned above, the no-GIL proof-of-concept interpreter is about 10% faster than CPython 3.9 (and 3.10) on the pyperformance benchmark suite.
>
> so it didn't occur to me that I'd be looking at a slowdown, much less by as much as I'm seeing.
>
> Maybe I've somehow stumbled on some instruction mix for which the nogil VM is much worse than the stock VM. For now, I prefer to think I'm just doing something stupid. It certainly wouldn't be the first time.
>
> Skip
>
> P.S. I suppose I should have cc'd Sam when I first replied to this thread, but I'm doing so now. I figured my mistake would reveal itself early on. Sam, here's my first post about my little "project."
>
> https://firstname.lastname@example.org/message/WBLU6PZ2RDPEMG3ZYBWSAXUGXCJNFG4A/
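
For anyone who wants to rule out a NumPy version mismatch before comparing interpreters, here is a minimal sketch that could be run under each Python. It is not the matmul.py script from this thread (its contents are not shown here); the names numpy_version and time_matmul, the matrix size, and the repeat count are all assumptions chosen for illustration.

```python
import sys
import time
from importlib import metadata


def numpy_version():
    """Return the installed NumPy version string, or None if NumPy is absent."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


def time_matmul(n=200, repeats=5):
    """Return the best wall-clock time in seconds for one n x n float64 matmul.

    Hypothetical stand-in workload, not the matmul.py discussed in the thread.
    """
    import numpy as np  # imported lazily so numpy_version() works without NumPy

    rng = np.random.default_rng(0)  # fixed seed so both interpreters see the same data
    a = rng.random((n, n))
    b = rng.random((n, n))
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        a @ b
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print("Python:", sys.version.split()[0])
    version = numpy_version()
    print("NumPy: ", version)
    if version is not None:
        print("best matmul time: %.6f s" % time_matmul())
```

If the two interpreters report different NumPy versions, the timings are probably comparing NumPy builds rather than the interpreters themselves.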