# Help with an 8th grade science project

Dave Angel davea at davea.name
Thu Nov 20 21:13:07 CET 2014

```
dave em <daveandem2000 at gmail.com> wrote in message:
> Hello,
>
> I am the adult advisor (aka father) to an 8th grader who is doing a science project that will compare / benchmark CPU performance between an AMD 64 Phenom II running Ubuntu 14.04 and a Raspberry Pi 700MHz ARM.
>
> Basic procedure:
> -  Run a Python script on both computers that stresses the CPU and measure
> --  Time to complete the calculation
> -- Max CPU during the calculation
> -- We have chosen to do factorials and compare performance by running calculations by order of magnitude.  Our hypothesis is that we will begin to see a wider performance gap between the two computers as the factorials increase in order of magnitude.
>
> Status:
> -  We have a working program.  Pseudo code follows:
>
> import linux_metrics
> from linux_metrics import cpu_stat
> import time
>
> print 'Welcome to the stress test'
> number = raw_input("Enter the number to compute the factorial:")
>
> ## function to calculate CPU usage percentage
> def CPU_Percent():
>     cpu_pcts = cpu_stat.cpu_percents(.25)
>     print 'cpu utilization: %.2f%%' % (100 - cpu_pcts['idle'])
>     write cpu utilization to a csv file with g.write
>
> ## function to compute factorial of a given number
> def factorial(n):
>     num = 1
>     while n >= 1:
>         num = num * n
>         CPU_Percent()  ****This is the function call irt Q 1 below ****
>         n = n - 1
>     return num
>
> # Main program
> Record start time by using time.time()
> Call function to compute the factorial.
> Record finish time by using time.time()
> write time to compute to a file f.write(totalEndTime - totalStartTime)
> print ("Execution time = ", totalEndTime - totalStartTime)
>
>
> Questions:
> 1.  In the factorial() function we call the CPU_Percent() function and write the CPU utilization value to a file.
> -  Is this a correct value or will the CPU utilization be lower because the factorial() function made its calculation and is now just checking the CPU utilization?

I'm not familiar with that package; I just took a quick look at
pypi. So I'd have to guess. But since your timing is so huge, I'd
guess that you're measuring utilization during a time period that
your factorial calculation is paused. In other words you're
measuring cpu utilization for the other processes in your
system.

Probably someone else will correct me, but I'd guess you need to
measure utilization with a separate process.

> -  If we are not getting the true max CPU utilization, can someone offer a design change to accomplish this?
>
> 2.  What unit does time.time() use?  An example for calculating the factorial of 10 is our program gives:
>   Execution time = ', 1.5703258514404297  I presume this is telling us it took 1.57 seconds to complete the calculation?

It does indeed give results in seconds, but that value is
ridiculous. Calculating factorial of 10 takes about 70
microseconds on this laptop. And doing it for 10,000 (which
gives a very large result) takes about a tenth of a second,
including printing it, which takes longer than calculating
it.

Benchmarking can be extremely tricky,  and I assume you're not
permitted to use the timeit module. But at the very
least:

Measure an empty loop and compare it to the real loop. If the two
times are similar, then you're mostly measuring loop overhead.
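
# A minimal sketch of that comparison (illustrative only, not taken from
# the original program; time_loop and do_work are made-up names). It times
# the same countdown loop with and without the multiplication, so the
# difference approximates the cost of the arithmetic alone.
import time

def time_loop(n, do_work):
    """Return (seconds, result) for a countdown loop over n..1."""
    num = 1
    start = time.time()
    while n >= 1:
        if do_work:
            num = num * n   # the real work
        n = n - 1
    return time.time() - start, num

empty_seconds, _ = time_loop(10000, False)
real_seconds, result = time_loop(10000, True)
# If the two numbers are close, the test is mostly measuring loop overhead.
print("empty loop: %.6f s, factorial loop: %.6f s" % (empty_seconds, real_seconds))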

Watch out for doing i/o during the timed part of the test; you may
be mostly measuring console time or file time, and not your
algorithm. Do your i/o after the ending call to time.time.
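
# A small sketch of keeping i/o out of the timed region (an illustration
# under assumed names such as timed_factorial; not the original script).
import time

def timed_factorial(n):
    """Return (n!, elapsed seconds) with no output inside the timing."""
    num = 1
    start = time.time()
    for i in range(2, n + 1):
        num = num * i
    elapsed = time.time() - start   # clock is stopped before any output
    return num, elapsed

value, elapsed = timed_factorial(10)
# All printing (or file writing) happens only after the measurement.
print("10! = %d" % value)
print("Execution time = %.6f seconds" % elapsed)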

If you get times in the microsecond or millisecond range, repeat the
whole mess in a loop so the total is long enough to sanity-check
with your wristwatch.
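
# One possible way to do that repetition (a sketch; average_seconds and
# the repeat count of 100000 are arbitrary choices for illustration).
import time

def factorial(n):
    num = 1
    for i in range(2, n + 1):
        num = num * i
    return num

def average_seconds(n, repeats):
    """Time many calls together and return the mean seconds per call."""
    start = time.time()
    for _ in range(repeats):
        factorial(n)
    return (time.time() - start) / repeats

per_call = average_seconds(10, 100000)
print("factorial(10) averages %.9f seconds per call" % per_call)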

Check each system to make sure time.time works well. Read the
docs, but do your own tests. Some systems only give you integer
seconds.
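
# A rough check of the clock's granularity (a sketch, not a calibrated
# measurement; clock_step is a made-up helper). It spins until
# time.time() returns a new value and reports the smallest step seen.
# On a system with integer-second resolution this takes a few seconds.
import time

def clock_step(samples=5):
    """Smallest positive gap observed between successive readings."""
    smallest = None
    for _ in range(samples):
        t0 = time.time()
        t1 = time.time()
        while t1 <= t0:          # wait for the clock to advance
            t1 = time.time()
        step = t1 - t0
        if smallest is None or step < smallest:
            smallest = step
    return smallest

print("time.time() advances in steps of about %.9f seconds" % clock_step())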

Take some measurements of the loop overhead to see how to minimize
it. I'd guess that range (or xrange, since you're apparently using
Python 2.x) will be faster than while with increment.
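
# A sketch for testing that guess rather than trusting it (helper names
# are illustrative). It falls back to range where xrange does not exist,
# so the same file runs on Python 2 and Python 3.
import time

try:
    count = xrange      # Python 2
except NameError:
    count = range       # Python 3

def factorial_while(n):
    num = 1
    while n >= 1:
        num = num * n
        n = n - 1
    return num

def factorial_for(n):
    num = 1
    for i in count(2, n + 1):
        num = num * i
    return num

def seconds_for(func, n, repeats=1000):
    """Mean seconds per call of func(n) over the given repeats."""
    start = time.time()
    for _ in count(repeats):
        func(n)
    return (time.time() - start) / repeats

print("while loop: %.9f s" % seconds_for(factorial_while, 100))
print("for loop:   %.9f s" % seconds_for(factorial_for, 100))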

If you're comparing two entirely different processors, make sure
you're using exactly the same version of Python. 2.7.5 on one
system probably should not be compared with 2.6.2, or even with
2.7.4.
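
# A quick way to record what each machine is actually running, so the
# version can be logged next to the timings (describe_python is a
# made-up helper; the platform and sys calls are standard library).
import platform
import sys

def describe_python():
    """Return e.g. implementation, full version and CPU type as text."""
    return "%s %s on %s" % (
        platform.python_implementation(),
        platform.python_version(),
        platform.machine(),
    )

print(describe_python())
print(sys.version)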

Don't forget the effects of other processes, and of disk caching.
You can probably minimize them by a fresh boot, and by flushing.

Watch out for memory usage.  You can calculate the factorial of
one hundred thousand in a few seconds.  But it's some 450
thousand digits long, and takes quite a bit of memory.
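
# A sketch for gauging that size (helper names are illustrative). The
# digit count is an estimate from math.lgamma, where lgamma(n + 1) is
# ln(n!), so it needs no big integer. sys.getsizeof reports the bytes
# held by the actual result; both need Python 2.7 or later.
import math
import sys

def factorial_digits(n):
    """Estimated number of decimal digits in n!."""
    return int(math.floor(math.lgamma(n + 1) / math.log(10))) + 1

def factorial_bytes(n):
    """Bytes used by n! as a Python integer object."""
    return sys.getsizeof(math.factorial(n))

print("100000! has about %d decimal digits" % factorial_digits(100000))
print("and occupies %d bytes as a Python integer" % factorial_bytes(100000))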

The math module has a factorial function in it. You could use it