[Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

Wed Mar 8 18:30:50 EST 2017

On Wed, Mar 8, 2017 at 2:58 PM, Elliot Gorokhovsky <
elliot.gorokhovsky at gmail.com> wrote:

>
> The whole point of my patch is that we do O(nlogn) compares, but only have
> O(n) elements, so it's much cheaper to do all the type checks in advance,
>

>  I mean, practically speaking, the advance check is basically free. The
> compare-time checks, in sum, are not, both asymptotically and practically.
>

hmm -- I know folks like to say that "the constants don't matter", but it
seems they do in this case:

without pre-checking:

O(nlogn)

with pre-checking If homogenous:

O(n) + O(nlogn)

so the pre-checking only adds if you ignore the constants..

But I'm assuming (particularly with locality and branch prediction and all
that included) the constant to type-check is much smaller than the constant
to compare two unknown types, so:

TC*n + KC*n*logn

vs

UC*n*logn

where:
TC -- Constant to type check
KC -- Constant known compare
UC -- Constant unknown type check

So if UC > TC/logn + KC

Then this optimization make sense.

If UC >KC and UC >>> TC, then this all works out.

But if n is HUGE, it may end up being slower (maybe more so with cache
locality???)

Which is why you need to profile all this carefully.

So far Elliott's experiments seem to show it works out well.

Which doesn't surprise me for built-ins like float and int that have a
native compare, but apparently also for more complex types? How about
custom classes?

-CHB

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20170308/5e4410e6/attachment-0001.html>