# Balanced trees

Steven D'Aprano steve+comp.lang.python at pearwood.info
Fri Mar 14 06:02:21 CET 2014

On Thu, 13 Mar 2014 20:12:41 -0700, Dan Stromberg wrote:

> Sorting the same (slightly tweaked) data inside of a tight loop is
> rarely a good idea - despite the fact that the "sort" itself tends to be
> O(n) with Python's rather awesome builtin sorting algorithm.  This is
> because sorting inside a loop tends to yield O(n^2) algorithms, or even
> O((n^2)*logn).

I agree with you except for the word "rarely". It depends on how much
data you have, and in my experience most people think that N = 100 is a
lot :-)
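A minimal sketch of the quoted pattern (the running-median task and the function names are my own illustration, not from the thread): re-sorting an accumulating list on every iteration versus keeping it ordered with the stdlib `bisect` module. `bisect.insort` still pays O(n) per insertion for the list shift, so a tree or heap may be needed for truly large n; the point is only that the repeated full sort is redundant work.

```python
import bisect
import random

def running_medians_resort(values):
    """Re-sort the accumulated data on every iteration.  Each sort is
    O(n log n) in general (close to O(n) on nearly-sorted data, thanks
    to Timsort), so the loop as a whole is O(n^2) or worse."""
    acc, medians = [], []
    for v in values:
        acc.append(v)
        acc.sort()
        medians.append(acc[len(acc) // 2])
    return medians

def running_medians_insort(values):
    """Keep the list sorted with bisect.insort: one O(log n) search
    plus an O(n) list shift per insertion, with no repeated sort pass."""
    acc, medians = [], []
    for v in values:
        bisect.insort(acc, v)
        medians.append(acc[len(acc) // 2])
    return medians

data = [random.randrange(1000) for _ in range(500)]
assert running_medians_resort(data) == running_medians_insort(data)
```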

Remember that Big Oh doesn't say anything about the constant of
proportionality. If you have one function that behaves like O(N) but with
a constant factor of 1000, and another function that behaves like
O(N**2) with a constant factor of 0.001, the second, theoretically
slower, function will actually be faster for N < one million.
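Plugging in the two constant factors from the paragraph above, the crossover is easy to check numerically (a toy cost model, not real timings):

```python
def cost_linear(n):      # O(N) with a large constant factor
    return 1000 * n

def cost_quadratic(n):   # O(N**2) with a tiny constant factor
    return 0.001 * n * n

# The "slower" quadratic function is cheaper right up to the
# crossover point at N = 1000 / 0.001 = one million:
assert cost_quadratic(10**6 - 1) < cost_linear(10**6 - 1)
assert cost_quadratic(10**6 + 1) > cost_linear(10**6 + 1)
```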

So *in practice*, a solution that involves a theoretically worse Big Oh
function but a significantly lower constant factor may turn out to be
better for all the values you care about. Given the difference in speed
between Python's built-ins and the code you can write yourself in pure
Python, it is normally better to push as much work as you can onto the
built-ins.
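One common way to see that constant-factor gap (my example, not from the post): summing a list with a hand-written loop versus the built-in `sum`, which does the same O(N) work in C:

```python
from timeit import timeit

def manual_sum(values):
    """Pure-Python equivalent of the built-in sum()."""
    total = 0
    for v in values:
        total += v
    return total

data = list(range(100_000))
assert manual_sum(data) == sum(data)

# Both are O(N); only the constant factor differs, typically by
# several times in favour of the C implementation:
print("pure Python:", timeit(lambda: manual_sum(data), number=100))
print("built-in:  ", timeit(lambda: sum(data), number=100))
```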

And of course, we should not make assumptions about what is faster and
what is slower. After 15+ years, I'm still surprised by what ends up
being faster in Python.

> But if you're sorting once at the end of whatever other processing,
> sorting is great.  It's (still) O(n log n), but it's got a terrific
> constant.

Exactly!
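To see that terrific constant, and Timsort's adaptivity on the "slightly tweaked" case, here is a quick `timeit` comparison (the sizes and number of tweaks are arbitrary choices of mine):

```python
import random
from timeit import timeit

data = list(range(100_000))

# Slightly tweaked data: a sorted list with a handful of swaps.
nearly_sorted = data[:]
for _ in range(10):
    i = random.randrange(len(nearly_sorted))
    j = random.randrange(len(nearly_sorted))
    nearly_sorted[i], nearly_sorted[j] = nearly_sorted[j], nearly_sorted[i]

shuffled = data[:]
random.shuffle(shuffled)

# Timsort detects the long pre-existing runs, so the nearly-sorted
# input sorts in close to O(n) time rather than O(n log n):
print("nearly sorted:", timeit(lambda: sorted(nearly_sorted), number=20))
print("shuffled:     ", timeit(lambda: sorted(shuffled), number=20))

assert sorted(nearly_sorted) == sorted(shuffled) == data
```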

--
Steven
