# improving a huge double-for cycle

Tim Chase python.list at tim.thechases.com
Thu Sep 18 14:57:16 CEST 2008

```> Code: Select all
>     for i in range(len(IN)): #scan all elements of the list IN
>       for j in range(len(IN)):
>         if i <> j:
>          if IN[i].coordinates[0] == IN[j].coordinates[0]:
>            if IN[i].coordinates[1] == IN[j].coordinates[1]:
>               SN.append(IN[i].label)
>
>
> Unfortunately my len(IN) is about 100.000 and the running time about
> 15h !!!! :(
>
> Any idea to improve it?
[snip]
> I have already tried to group the "if statements" in a single one:
>
> Code: Select all
>     if i <> j and if IN[i].coordinates[0] == IN[j].coordinates[0] and
> if IN[i].coordinates[1] == IN[j].coordinates[1]:
>
> but no improvements.

It's like rearranging deck-chairs on the Titanic :)  Yes, it may
give a speed up, but what's 3 seconds when you're waiting 15hr :)

Not knowing the len(IN[x].coordinates) or their structure, if
it's a list of len==2, you should be able to just do

if i <> j and IN[i].coordinates == IN[j].coordinates

or

if i <> j and IN[i].coordinates[:2] == IN[j].coordinates[:2]

However, again, this is just polish.  The big problem is that you
have an O(N^2) algorithm that's killing you.

1) use xrange instead of range to save eating memory with a huge
unneeded array.

2) unless you need to append duplicate labels, you know that when
I and J are swapped, you'll reach the same condition again, so it
might be worth writing the outer loops to eliminate this
scenario, and in the process, but just starting at i+1 rather
than i, you can forgo the check if "i<>j".

Such changes might look something like

for i in xrange(len(IN)):
for j in xrange(i+1, len(IN)):
if IN[i].coordinates == IN[j].coordinates:
SN.append(IN[i].label)

If my college algorithms memory serves me sufficiently, this
reduces your O(N^2) to O(N log N) which will garner you some
decent time savings.

-tkc

```