[Python-ideas] Exploiting type-homogeneity in list.sort() (again!)
Chris Barker
chris.barker at noaa.gov
Wed Mar 8 18:30:50 EST 2017
On Wed, Mar 8, 2017 at 2:58 PM, Elliot Gorokhovsky <
elliot.gorokhovsky at gmail.com> wrote:
>
> The whole point of my patch is that we do O(nlogn) compares, but only have
> O(n) elements, so it's much cheaper to do all the type checks in advance,
>
> I mean, practically speaking, the advance check is basically free. The
> compare-time checks, in sum, are not, both asymptotically and practically.
>
hmm -- I know folks like to say that "the constants don't matter", but it
seems they do in this case:
without pre-checking:
O(nlogn)
with pre-checking If homogenous:
O(n) + O(nlogn)
so the pre-checking only adds if you ignore the constants..
But I'm assuming (particularly with locality and branch prediction and all
that included) the constant to type-check is much smaller than the constant
to compare two unknown types, so:
TC*n + KC*n*logn
vs
UC*n*logn
where:
TC -- Constant to type check
KC -- Constant known compare
UC -- Constant unknown type check
So if UC > TC/logn + KC
Then this optimization make sense.
If UC >KC and UC >>> TC, then this all works out.
But if n is HUGE, it may end up being slower (maybe more so with cache
locality???)
Which is why you need to profile all this carefully.
So far Elliott's experiments seem to show it works out well.
Which doesn't surprise me for built-ins like float and int that have a
native compare, but apparently also for more complex types? How about
custom classes?
-CHB
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20170308/5e4410e6/attachment-0001.html>
More information about the Python-ideas
mailing list