Big-O notation (was: Re: Python and Schools)
Mike C. Fletcher
mcfletch at rogers.com
Tue Apr 15 16:32:34 EDT 2003
Greg Brondo wrote:
...
>
> I understand why the code is bad (copying the string each time in
> memory as it grows). However, I've never understood the notations
> O(N) and O(N**2)?
...
Here's a crack at explaining it:
Big-O notation characterises an algorithm's (as distinct from an
implementation's) performance in relation to the size of its input
set. It is normally of interest when judging the
scalability/suitability of a particular algorithm. The constant (e.g.
setup) and proportional (e.g. implementation) factors in the
algorithm are factored out, under the assumption that they are
swamped by the effect of the input set's size as that goes to
infinity.
Common examples of big-O performance types (each illustrated in the
Python sketch just after this list):
* O(1) -- time is constant, that is, it takes the same amount of
  time to do something regardless of how large the input set is.
* O(log N) -- time grows as the log of N (for those who don't
  remember high-school math, this means it grows very slowly
  compared to N).
* O(N) -- time grows linearly as the size of the set grows (e.g.
  process every item in the set in turn).
* O(N log N) -- see O(N) and O(log N); basically, you do an amount
  of work tied to the size of the whole set for every stage of an
  O(log N) operation (this is comparatively rarely discussed in my
  experience).
* O(N**2), O(N**3), et cetera -- time grows as a power (e.g. 2, 3,
  3.4, 10) of the size of the set, so you get a polynomial explosion
  of running time as set size increases. These algorithms tend to
  become impractical at surprisingly modest values of N.
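To make those classes concrete, here's a small Python sketch of my
own (not from Greg's code; the names are mine) showing a typical
standard-library operation for each:

    import bisect

    items = list(range(100))    # a sorted list
    members = set(items)        # a hash-based set

    # O(1): a hash lookup takes (on average) the same time
    # no matter how large the set is
    assert 50 in members

    # O(log N): binary search halves the remaining range each step
    assert bisect.bisect_left(items, 50) == 50

    # O(N): a linear scan may touch every item in turn
    assert 50 in items

    # O(N log N): Python's sort is a comparison sort
    ranked = sorted(items, reverse=True)

    # O(N**2): comparing every item against every other item
    pairs = [(a, b) for a in items for b in items]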
You can have all sorts of fun learning how to combine and refactor the
equations to get a final big-O characterisation, but for the most part,
practical coders can get away with recognising and avoiding O(N**X)
situations where X > 1. Recognising the underlying patterns behind
O(log N) (binary searches and the like) and O(1)-ish situations is
useful as well, as those provide very good general scalability. The
calculus of big-O notations is just a convenient shorthand for people
doing rigorous analysis, AFAICS.
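Greg's quoted example above is the classic O(N**2) trap: building a
string with repeated concatenation copies everything accumulated so
far on each step, so the total work is roughly 1 + 2 + ... + N
character copies. A minimal sketch of the pattern and the usual fix
(the actual code under discussion isn't shown here, so these names
are mine):

    pieces = ['spam'] * 10000

    # O(N**2) in the total length: each concatenation may copy the
    # whole result built so far
    result = ''
    for piece in pieces:
        result = result + piece

    # O(N): collect the pieces first, then copy each character once
    result = ''.join(pieces)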
Keep in mind that the setup and constant factors of the algorithm are
factored out of the equations. An algorithm which is O(1) may be slower
than one which is O(N**3) for all practical values of N if the setup
time of the O(1) algorithm approaches the age of the universe.
Similarly, two algorithms which are both O(N) but have vastly different
constant factors (implementations) may have dramatically different
running times.
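For instance (my own illustration, not anything from the thread): a
hand-written Python loop and the built-in sum() are both O(N), but
sum() runs its loop in C, so its constant factor is far smaller and
it finishes much sooner in practice:

    def slow_sum(numbers):
        # O(N), but pays interpreter overhead on every element
        total = 0
        for n in numbers:
            total += n
        return total

    data = list(range(10 ** 6))

    # Both calls are O(N); sum() merely has a much smaller constant
    # factor, so it is typically several times faster.
    assert slow_sum(data) == sum(data)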
HTH,
Mike
_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/