A Sort Optimization Technique: decorate-sort-dedecorate
Joachim Durchholz
jo at durchholz.org
Tue Aug 29 15:55:57 EDT 2006
Tim Peters wrote:
> [Joachim Durchholz]
>>>> Wikipedia says it's going from 2NlogN to N. If a sort is massively
>>>> dominated by the comparison, that could give a speedup of up to 100%
>>>> (approximately - dropping the logN factor is almost irrelevant, what
>>>> counts is losing that factor of 2).
>
> [Gabriel Genellina]
>>> In fact it's the other way - losing a factor of 2 is irrelevant,
>>> O(2N)=O(N). The logN factor is crucial here.
>
> [Joachim Durchholz]
>> That's just a question of what you're interested in.
>>
>> If it's asymptotic behavior, then the O(logN) factor is a difference.
>>
>> If it's practical speed, a constant factor of 2 is far more relevant
>> than any O(logN) factor.
>
> Nope. Even if you're thinking of base 10 logarithms, log(N, 10) > 2
> for every N > 100. Base 2 logarithms are actually most appropriate
> here, and log(N, 2) > 2 for every N > 4. So even if the "2" made
> sense here (it doesn't -- see next paragraph), the log(N) term
> dominates it for even relatively tiny values of N.
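The crossover Tim describes is easy to verify numerically. A quick sketch (the sample values of N are my own, chosen only for illustration):

```python
import math

# log2(N) passes 2 as soon as N exceeds 4, so a factor of log(N)
# outgrows a constant factor of 2 for all but tiny inputs.
for n in (4, 5, 100, 10**6):
    print(n, math.log2(n))
```

For N = 10**6, log2(N) is already close to 20, an order of magnitude larger than the constant factor of 2 under discussion.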
Whether this argument is relevant depends on the constant factors
associated with each term.
Roughly speaking, if the constant factor on the O(N) term is 100 and the
constant factor on the O(logN) term is 1, then the logN term is still
irrelevant in practice: 100*N exceeds N*log2(N) for every N below 2^100.
My point actually is this: big-Oh measures are fine for comparing
algorithms in general, but when it comes to optimizing concrete
implementations, their value greatly diminishes: you still have to
investigate the constant factors and all the details that the big-Oh
notation abstracts away.
From that point of view, it's irrelevant whether some part of the
algorithm contributes an O(1) or an O(logN) factor: the decision where
to optimize is almost entirely dominated by the constant factors.
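For reference, the decorate-sort-undecorate technique named in the subject line can be sketched in a few lines of Python. This is a minimal illustration, not code from the thread; the function name `dsu_sort` and the lower-casing key (a stand-in for an expensive key computation) are my own:

```python
# Decorate-sort-undecorate (Schwartzian transform): compute each sort
# key exactly once up front, instead of recomputing it inside every
# one of the ~N*logN comparisons.
def dsu_sort(items, key):
    # Decorate: pair each item with its precomputed key; the index
    # keeps the sort stable and prevents comparing items directly.
    decorated = [(key(item), i, item) for i, item in enumerate(items)]
    decorated.sort()  # tuples compare on the precomputed key first
    # Undecorate: strip the keys back off.
    return [item for _, _, item in decorated]

words = ["Banana", "apple", "Cherry"]
print(dsu_sort(words, str.lower))  # -> ['apple', 'Banana', 'Cherry']
```

This is exactly the trade-off discussed above: the decoration adds an O(N) pass with its own constant factor, in exchange for cheaper comparisons inside the O(N logN) sort. (In modern Python, `sorted(words, key=str.lower)` performs the same precomputation internally.)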
Regards,
Jo