Warning: the contents of this message may be dangerous for readers with heart conditions.
So today I looked at PyFloat_RichCompare. I had been scared initially because it was so complicated, I was worried I would miss some important error check or something if I special cased it. So I took the following approach: I replaced all statements of the form PyLong_Check(w) with 0, and all statements of the form PyFloat_Check(w) with 1. Because that's the whole point; we're cutting out type checks. I then cut out all blocks preceded by if (0). So it's definitely safe.
And it turns out that if you do that, PyFloat_RichCompare becomes ONE LINE (after we set op=Py_LT)!!! Just
return (PyFloat_AS_DOUBLE(v) < PyFloat_AS_DOUBLE(w)) == 0 ? Py_False : Py_True;
So I thought, wow, this will give some nice numbers! But I underestimated the power of this optimization. You have no idea. It's crazy.
Since we're dealing with floats, I used Tim Peter's benchmark: Lib/test/sortperf.py, just modifying one function:
#Old function Tim Peters wrote:
def doit_fast(L):
t0 = time.perf_counter()
L.fastsort()
t1 = time.perf_counter()
print("%6.2f" % (t1-t0), end=' ')
flush()
#My function:
def doit(L):
F = FastList(L)
f0 = time.perf_counter()
F.fastsort()
f1 = time.perf_counter()
F = FastList(L)
t0 = time.perf_counter()
F.sort()
t1 = time.perf_counter()
print("%6.2f%%" % (100*(1-(f1-f0)/(t1-t0))), end=' ')
flush()
So the benchmarking here is valid. I didn't write it. All I did was modify it to print percent improvement instead of sort time. The numbers below are percent improvement of my sort vs default sort (just clone my repo and run python sortperf.py to verify):
i 2**i *sort \sort /sort 3sort +sort %sort ~sort =sort !sort
15 32768 44.11% 54.69% 47.41% 57.61% 50.17% 75.24% 68.03% 65.16% 82.16%
16 65536 14.14% 75.38% 63.44% 56.56% 67.99% 66.19% 50.72% 61.55% 61.87%
17 131072 73.54% 60.52% 60.97% 61.63% 52.55% 49.84% 68.68% 84.12% 65.90%
18 262144 54.19% 55.34% 54.67% 54.13% 55.62% 52.88% 69.30% 74.86% 72.66%
19 524288 55.12% 53.32% 53.77% 54.27% 52.97% 53.53% 67.55% 76.60% 78.56%
20 1048576 55.05% 51.09% 60.05% 50.69% 62.98% 50.20% 66.24% 71.47% 61.40%
This is just insane. This is crazy. I didn't write this benchmark, OK, this is a valid benchmark. Tim Peters wrote it. And except for one trial, they all show more than 50% improvement. Some in the 70s and 80s. This is crazy. This is so cool. I just wanted to share this with you guys. I'll submit a patch to
bugs.python.org soon; I just have to write a special case comparison for tuples and then I'm basically done. This is so cool!!!!!!!!! 50% man!!!!! Crazy!!!!!
Elliot