How can I make this piece of code even faster?
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Sun Jul 21 01:11:51 EDT 2013
On Sat, 20 Jul 2013 13:22:03 -0700, pablobarhamalzas asked:
"How can I make this piece of code even faster?"
- Use a faster computer.
- Put in more memory.
- If using Unix or Linux, decrease the process's "nice" value (a lower
nice value means a higher scheduling priority).
I mention these because sometimes people forget that if you have a choice
between "spend 10 hours at $70 per hour to optimize code", and "spend
$200 to put more memory in", putting more memory in may be more cost
effective.
Other than that, what you describe sounds like it could be a good
candidate for PyPy to speed the code up, although PyPy is still (mostly)
Python 2. You could take this question to the pypy mailing list and ask
there.
http://mail.python.org/mailman/listinfo/pypy-dev
You also might like to try Cython or Numba.
As far as pure-Python optimizations, once you have a decent algorithm,
there's probably not a lot of room for major speed ups. But a couple of
thoughts and a possible optimized version come to mind...
1) In general, it is better/faster to iterate over lists directly, than
indirectly by index number:
for item in sequence:
    process(item)

rather than:

for i in range(len(sequence)):
    item = sequence[i]
    process(item)
If you need both the index and the value:
for i, item in enumerate(sequence):
    print(i, process(item))
In your specific case, if I have understood your code's logic, you can
just iterate directly over the appropriate lists, once each.
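If you want to see the difference for yourself, a quick micro-benchmark
with timeit makes the point (numbers will vary by machine, but the
ordering should be stable):

```python
import timeit

setup = "data = list(range(1000))"

# Direct iteration: the loop protocol hands each item to the body.
direct = timeit.timeit("for item in data: item + 1",
                       setup=setup, number=1000)

# Index-based: pays for range(), len() and an indexing step per item.
indexed = timeit.timeit("for i in range(len(data)): data[i] + 1",
                        setup=setup, number=1000)
```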
2) You perform an exponentiation using math.e**(-temp). You will probably
find that math.exp(-temp) is both faster and more accurate.
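A quick sanity check that the two spellings agree (to within float
rounding) while going through very different code paths:

```python
import math

temp = 0.5
# math.e**(-temp) goes through the generic float power operator;
# math.exp(-temp) is a single call into the C library's exp().
a = 1/(1 + math.e**(-temp))
b = 1/(1 + math.exp(-temp))
```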
3) If you need to add numbers, it is better to call sum() or math.fsum()
than add them by hand. sum() may be a tiny bit faster, or maybe not, but
fsum() is more accurate for floats.
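The accuracy difference is easy to demonstrate: ten lots of 0.1 should
sum to exactly 1.0, but the builtin sum() accumulates rounding error
while fsum() tracks the partial sums exactly:

```python
import math

values = [0.1] * 10
naive = sum(values)        # accumulates one rounding error per addition
exact = math.fsum(values)  # tracks exact partial sums internally
```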
See below for my suggestion on an optimized version.
> Ok, I'm working on a predator/prey simulation, which evolve using
> genetic algorithms. At the moment, they use a quite simple feed-forward
> neural network, which can change size over time. Each brain "tick" is
> performed by the following function (inside the Brain class):
>
> def tick(self):
>     input_num = self.input_num
>     hidden_num = self.hidden_num
>     output_num = self.output_num
>
>     hidden = [0]*hidden_num
>     output = [0]*output_num
>
>     inputs = self.input
>     h_weight = self.h_weight
>     o_weight = self.o_weight
>
>     e = math.e
>
>     count = -1
>     for x in range(hidden_num):
>         temp = 0
>         for y in range(input_num):
>             count += 1
>             temp += inputs[y] * h_weight[count]
>         hidden[x] = 1/(1+e**(-temp))
>
>     count = -1
>     for x in range(output_num):
>         temp = 0
>         for y in range(hidden_num):
>             count += 1
>             temp += hidden[y] * o_weight[count]
>         output[x] = 1/(1+e**(-temp))
>
>     self.output = output
>
> The function is actually quite fast (~0.040 seconds per 200 calls, using
> 10 input, 20 hidden and 3 output neurons), and used to be much slower
> until I fiddled about with it a bit to make it faster. However, it is
> still somewhat slow for what I need.
>
> My question to you is whether you can see any obvious (or not so
> obvious) way of making this faster. I've heard about numpy and have been
> reading about it, but I really can't see how it could be implemented here.
Here's my suggestion. The key observation is that your count variable is
just walking through the flat weight lists one neuron-sized slice at a
time, so you can replace the index bookkeeping with zip() and slicing:

def tick(self):
    exp = math.exp
    fsum = math.fsum  # more accurate than the builtin sum for floats
    inputs = self.input
    n = self.input_num
    h_weight = self.h_weight
    # Hidden neuron x owns the weight slice h_weight[x*n:(x+1)*n].
    # This assumes inputs has exactly input_num values.
    hidden = [1/(1 + exp(-fsum(i*w for i, w in
                               zip(inputs, h_weight[x*n:(x+1)*n]))))
              for x in range(self.hidden_num)]
    m = self.hidden_num
    o_weight = self.o_weight
    # Likewise, output neuron x owns the slice o_weight[x*m:(x+1)*m].
    self.output = [1/(1 + exp(-fsum(h*w for h, w in
                                    zip(hidden, o_weight[x*m:(x+1)*m]))))
                   for x in range(self.output_num)]

I have not timed this against your version, but by dropping the manual
index arithmetic and using exp() and fsum(), I would expect it to be at
least as fast and a little more accurate.
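On the numpy question: your two loops are exactly matrix-vector
products, so if the flat weight lists are reshaped into 2-D arrays the
whole tick collapses to two lines. Here is a sketch (untested against
your class, and it assumes h_weight holds hidden_num*input_num values
laid out one hidden neuron at a time, as your count variable implies;
tick_numpy and sigmoid are names I've made up):

```python
import numpy as np

def sigmoid(x):
    # Element-wise logistic function, same as your 1/(1+e**(-temp)).
    return 1.0/(1.0 + np.exp(-x))

def tick_numpy(inputs, h_weight, o_weight,
               input_num, hidden_num, output_num):
    # Reshape the flat weight lists into (neurons, inputs) matrices.
    W1 = np.asarray(h_weight).reshape(hidden_num, input_num)
    W2 = np.asarray(o_weight).reshape(output_num, hidden_num)
    hidden = sigmoid(W1.dot(np.asarray(inputs)))
    return sigmoid(W2.dot(hidden))
```

With all the weights zero, every neuron should output sigmoid(0) = 0.5,
which makes a handy sanity check.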
Good luck!
--
Steven