How to make this faster
steve+comp.lang.python at pearwood.info
Sat Jul 6 05:05:30 CEST 2013
On Fri, 05 Jul 2013 18:39:15 +0000, Helmut Jarausch wrote:
> On Fri, 05 Jul 2013 16:50:41 +0000, Steven D'Aprano wrote:
>> On Fri, 05 Jul 2013 16:07:03 +0000, Helmut Jarausch wrote:
>>> The solution above take 0.79 seconds (mean of 100 calls) while the
>>> following version take 1.05 seconds (mean of 100 calls):
>> 1) How are you timing the calls?
> I've put the work horse "Solve" into a loop executing 100 times. That's
> on an otherwise idle Linux machine.
That doesn't explain how you time it, only that you have a loop executing
100 times. Are you using time.time, or time.clock? (I trust you're not
measuring times by hand with a stopwatch.)
I expect you're probably doing something like this:
start = time.time()
for i in range(100):
    Solve()
t = time.time() - start
print "time taken", t/100
In this case, you're getting times well over a second, so the overhead is
unlikely to be significant. E.g. the time it takes to create range(100)
will be insignificant compared to the time it takes to run Solve.
Nevertheless, as a matter of good practice, I recommend that you use the
timeit module, as it's carefully written to minimize the overhead when
timing small code snippets where the overhead is *not* insignificant.
For longer-running code, like this, timeit is still worth using, if only
for the convenience of not writing the timing scaffolding yourself.
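As a sketch of what that looks like (solve here is a hypothetical stand-in for the poster's Solve function, and Python 3 syntax is used):

```python
import timeit

def solve():
    # Hypothetical stand-in for the poster's Solve function;
    # substitute the real workhorse here.
    return sum(i * i for i in range(10000))

# timeit runs the callable `number` times and returns the total
# elapsed time, taking care of loop overhead for you.
t = timeit.timeit(solve, number=100)
print("total for 100 calls:", t)
print("per call:", t / 100)
```

timeit also disables garbage collection during the run by default, which removes one more source of one-sided noise.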
>> 2) Don't use the mean, that's the wrong statistic when you are
>> measuring something where the error is always one sided. You want the
>> minimum, not the mean.
> Here you assume time measuring itself is without error - I doubt that.
No, I don't. I may have glossed over all the sources of measurement
error, but they are all *positive* which is the important thing. The time
measured will never be *less* than the actual time taken. So regardless
of the source of measurement error, the way to minimize it is to only
look at the minimum timing, not the mean.
> If the timing version, which executes function "Solve" one hundred
> times, runs about 80-100 seconds without a significant variation, then
> taking the mean is mathematically correct.
If the best you can say is it takes "80-100 seconds", that's pretty
significant variation, of the order of 20%.
In this case, with times of the order of a second per loop, it may be
reasonable to say "in this specific case, the error is too small to care
about", or "I just don't care about the error, since it will be about the
same for different variations of my solve function". But in that case,
why bother looping 100 times?
> I can't take the minimum
> since I don't measure the time a single call takes.
Then perhaps you should.
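One way to do that is timeit.repeat with number=1, which times each call separately so the minimum is available (again, solve is a hypothetical stand-in for the poster's Solve):

```python
import timeit

def solve():
    # Hypothetical stand-in for the poster's Solve function.
    return sum(i * i for i in range(10000))

# repeat() returns one elapsed total per run. Taking the minimum
# discards runs inflated by OS scheduling and other one-sided
# measurement noise, which the mean would smear across all runs.
times = timeit.repeat(solve, number=1, repeat=5)
best = min(times)
print("best single call:", best)
```

The minimum can never overstate the true cost of a call, whereas the mean of five runs is dragged upward by whichever runs happened to be interrupted.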