A Sort Optimization Technique: decorate-sort-dedecorate

Joachim Durchholz jo at durchholz.org
Tue Aug 29 15:55:57 EDT 2006


Tim Peters schrieb:
> [Joachim Durchholz]
>>>> Wikipedia says it's going from 2NlogN to N. If a sort is massively
>>>> dominated by the comparison, that could give a speedup of up to 100%
>>>> (approximately - dropping the logN factor is almost irrelevant, what
>>>> counts is losing that factor of 2).
> 
> [Gabriel Genellina]
>>> In fact it's the other way - losing a factor of 2 is irrelevant,
>>> O(2N)=O(N). The logN factor is crucial here.
> 
> [Joachim Durchholz]
>> That's just a question of what you're interested in.
>>
>> If it's asymptotic behavior, then the O(logN) factor is a difference.
>>
>> If it's practical speed, a constant factor of 2 is far more relevant
>> than any O(logN) factor.
> 
> Nope.  Even if you're thinking of base 10 logarithms, log(N, 10) > 2
> for every N > 100.  Base 2 logarithms are actually most appropriate
> here, and log(N, 2) > 2 for every N > 4.  So even if the "2" made
> sense here (it doesn't -- see next paragraph), the log(N) term
> dominates it for even relatively tiny values of N.

Whether this argument is relevant depends on the constant factors 
associated with each term.
Roughly speaking, if the constant factor on the O(N) term is 100 and the 
constant factor on the O(logN) term is 1, then the logN factor is still 
irrelevant for any input size you will meet in practice.
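For reference, a minimal sketch of the technique the thread is named after. The word list and the call counter are invented for illustration; the point is that decorating computes the key exactly N times, while a comparison-based sort that recomputes the key inside each comparison needs O(NlogN) key calls. (Modern Python's sorted(..., key=...) already performs this decoration internally.)

```python
import functools

calls = 0  # global counter for key() invocations

def key(s):
    """Stand-in for an expensive key function; counts its own calls."""
    global calls
    calls += 1
    return s.lower()

words = ["pear", "Apple", "banana", "Cherry", "date", "Fig"]

# Comparison-based sort that recomputes the key inside every
# comparison: roughly O(N log N) key calls.
calls = 0
naive = sorted(words, key=functools.cmp_to_key(
    lambda a, b: (key(a) > key(b)) - (key(a) < key(b))))
naive_calls = calls

# Decorate-sort-undecorate: compute each key once (exactly N calls),
# sort the (key, index, value) tuples, then strip the decoration.
# The index keeps the sort stable and avoids comparing the values.
calls = 0
decorated = [(key(w), i, w) for i, w in enumerate(words)]
decorated.sort()
dsu = [w for _, _, w in decorated]
dsu_calls = calls

assert naive == dsu
print("naive key calls:", naive_calls, "| DSU key calls:", dsu_calls)
```

Even on this six-element list the naive variant calls key() several times per comparison, while DSU stops at exactly six calls.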

My point actually is this: big-Oh measures are fine for comparing 
algorithms in general, but when it comes to optimizing concrete 
implementations, their value diminishes greatly: you still have to 
investigate the constant factors and all the details that big-Oh 
notation abstracts away.
From that point of view, it's irrelevant whether some part of the 
algorithm contributes an O(1) or an O(logN) factor: the decision where 
to optimize is almost entirely dominated by the constant factors.
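To put numbers on the hypothetical constants from above (100 on the N term, 1 on the NlogN term), a quick back-of-the-envelope check with invented cost functions:

```python
import math

def linear_cost(n):
    """Hypothetical O(N) term with a constant factor of 100."""
    return 100 * n

def nlogn_cost(n):
    """Hypothetical O(N log N) term with a constant factor of 1."""
    return n * math.log2(n)

# The log factor only overtakes the constant 100 once log2(N) > 100,
# i.e. N > 2**100 -- far beyond any practical input size.
for n in (10**3, 10**6, 10**9):
    print(n, linear_cost(n), nlogn_cost(n))
```

Asymptotically the NlogN term wins, but the crossover point sits at N > 2**100, which is exactly the sense in which the constant factors, not the log, decide where to optimize.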

Regards,
Jo
