[Python-ideas] INSANE FLOAT PERFORMANCE!!!
Tim Peters
tim.peters at gmail.com
Wed Oct 12 11:19:05 EDT 2016
[Paul Moore]
> My understanding is that the code does a per-check that all the
> elements of the list are the same type (float, for example). This is a
> relatively quick test (O(n) pointer comparisons). If everything *is* a
> float, then an optimised comparison routine that skips all the type
> checks and goes straight to a float comparison (single machine op).
That matches my understanding.
> Because there are more than O(n) comparisons done in a typical sort,
> this is a win.
If the types are in fact all the same, it should be a win even for
n==2 (at n < 2 no comparisons are done; at n==2 exactly 1 comparison
is done): one pointer compare + go-straight-to-C-float-"x<y"-later
(proposed) should be cheaper than the multiple layers of conditionals
and function calls currently required to dynamically deduce that
C-float-"x<y" is all that's really needed. So "more than O(n)"
comparisons sweetens the deal, when it applies, but even in the case
of a sorted or monotonically decreasing list (which require exactly
n-1 comparisons now) it should win.
Where it should predictably lose is when the types aren't all the
same. Then between 1 and n-2 pointer compares are wasted to determine
that. We still need timings to quantify the damage in such cases, but
I don't _expect_ it to be significant.
> And because the type checks needed in rich comparison
And layers of function calls.
> are much more expensive than a pointer check, it's a *big* win.
Bingo :-)
> What I'm *not* quite clear on is why Python 3's change to reject
> comparisons between unrelated types makes this optimisation possible.
It doesn't. It would also apply in Python 2. I simply expect the
optimization will pay off more frequently in Python 3 code. For
example, in Python 2 I used to create lists with objects of wildly
mixed types, and sort them merely to bring objects of the same type
next to each other. Things "like that" don't work at all in Python 3.
> Surely you have to check either way? It's not that it's a particularly
> important question - if the optimisation works, it's not that big a
> deal what triggered the insight. It's just that I'm not sure if
> there's some other point that I've not properly understood.
Well, either your understanding is fine, or we're both confused :-)
More information about the Python-ideas
mailing list