[SciPy-User] optimize a piece of code pruning arrays based on distance
William Heymann
immudzen at gmail.com
Wed Jan 24 05:44:31 EST 2018
Hello,
[Sorry if this is a duplicate. I realized I sent my previous email from the
wrong address]
I am trying to find a way to use scipy/numpy to optimize a piece of code.
>From profiling I already know over 90% of the runtime of the function is in
4 lines of code. This code is from selSPEA2 in DEAP in case that matters.
for i in range(N):
for j in range(1, size - 1):
if sorted_indices[i,j] == min_pos:
sorted_indices[i,j:size - 1] = sorted_indices[i,j + 1:size]
sorted_indices[i,j:size] = min_pos
break
The inner loop is inherently parallel since it is each iteration does not
depend on any other. With C++ I would use OpenMP or TBB to parallelize the
outer loop.
For a more complete picture I have also included the surrounding code. The
basic problem is if the number of entries (size) is greater than the number
allowed to be kept (k) for the next generation then the most similar items
(based on distance) need to be removed until size == k. At each iteration a
solution with the lowest distance to another solution is removed and then
the distance from the remove solution to all other solutions is set as
infinity. This prevents removing all points in a cluster.
Thank you
while size > k:
# Search for minimal distance
min_pos = 0
for i in range(1, N):
for j in range(1, size):
dist_i_sorted_j = distances[i,sorted_indices[i,j]]
dist_min_sorted_j = distances[min_pos,sorted_indices[min_pos,j]]
if dist_i_sorted_j < dist_min_sorted_j:
min_pos = i
break
elif dist_i_sorted_j > dist_min_sorted_j:
break
distances[:,min_pos] = float("inf")
distances[min_pos,:] = float("inf")
#This part is still expensive but I don't know a better way to do it
yet.
#Essentially all remaining time in this function is in this section
#It may even make sense to do this in C++ later since it is trivially
parallel
for i in range(N):
for j in range(1, size - 1):
if sorted_indices[i,j] == min_pos:
sorted_indices[i,j:size - 1] = sorted_indices[i,j + 1:size]
sorted_indices[i,j:size] = min_pos
break
# Remove corresponding individual from chosen_indices
to_remove.append(min_pos)
size -= 1
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scipy-user/attachments/20180124/e7386c58/attachment.html>
More information about the SciPy-User
mailing list