Big-O notation

Wed Apr 16 07:35:24 EDT 2003

In article <slrnb9q90a.mhg.news at datatailors.com>,
Peter van Kampen <news at datatailors.xs4all.nl> wrote:
>In article <mailman.1050457621.2105.python-list at python.org>, Mike C.
>Fletcher wrote:
>> Andrew Koenig wrote:
>>
>> Okay, so now we have "The Practical Coder's Guide to big-O Notation" :)
>> .  We rock ;) ,
>> Mike
>
>You do, you know...c.l.p really does.
>
>I think I almost 'get it'. Except who or what decides what Big-O a
>certain algorithm has. Is that by 'hunch' or by running benchmarks or
>are there still other ways to recognize 'Big-O's (except for the
>formal computations that I understand are not very common)?
>
>for example
>
>bigO_N(i):
>    for n in xrange(i):
>        print n
>
>Would be O(N) right?

Yes.  A slightly more concrete example:

def bigO_N (a_list):
    for a in a_list:
        print a

where run-time depends pretty much on the len (a_list).

>bigO_N2(i,j):
>    for n in xrange(i):
>        for o in xrange(j):
>            print i,j
>
>This would be O(N**2)? Actually O(i*j) I suppose but if i==j N**2
>applies.

Again:

def bigO_N2 (a_list):
    for n in a_list:
        for m in a_list:
            print n, m

I guess two different lists might be used, just as different
i and j values were used above.  The run-time depends on the
product of two sizes, so it's called order N-squared.  The
O-notation is intentionally sloppy .. scarcely better that a
hand-wave, but it's a useful hand-wave.

>But these are not really 'algorithms'.

   Yes they are.

>                                       Most algorithms are a lot more
>complex even when reduced to it's essentials.

   Most algorigthms have a lot of processing details that
can distract you from regarding the bare-bones structure.
The O notation is only about that structure:
 - hit one element of the data, ignoring the rest
 - a pass over the data
 - a pass over the data for each data element
 - a pass over the data for each permutation of elements
 - something else

>                                              I understand why
>quicksort is better than bubblesort but I don't see an obvious way to
>label quicksort O(??).

   Divide-and-Conquer algorithms tend to be O(n log n).
Quicksort works by dividing the data set up into a high set
and a low set, and then quicksorting each subset separately.
Making a pass over the set to divide it into high and low is
O(n) .. just like the O(n) examples above in this post.  The
number of times you will divide the set in two is roughly
log2(n).  So performing log(n) copies of a process that's
O(n) makes for a process that itself is O(n log n).

        Regards.        Mel.