# improving a huge double-for cycle

Peter Otten __peter__ at web.de
Thu Sep 18 14:45:34 CEST 2008

```Alexzive wrote:

> Hello there :) ,
>
> I am a python newbie and need to run following code for a task in an
> external simulation programm called "Abaqus" which makes use of python
> to access the mesh (ensamble of nodes with xy coordinates) of a
> certain geometrical model.
>
> [IN is the starting input containing the nodes to be check, there are
> some double nodes with the same x and y coordinates which need to be
> removed. SN is the output containing such double nodes]
>
> Code: Select all
>     for i in range(len(IN)): #scan all elements of the list IN
>       for j in range(len(IN)):
>         if i <> j:
>          if IN[i].coordinates == IN[j].coordinates:
>            if IN[i].coordinates == IN[j].coordinates:
>               SN.append(IN[i].label)
>
>
>
> Unfortunately my len(IN) is about 100.000 and the running time about
> 15h !!!! :(

> Any idea to improve it?
>
> I have already tried to group the "if statements" in a single one:
>
> Code: Select all
>     if i <> j and if IN[i].coordinates == IN[j].coordinates and
> if IN[i].coordinates == IN[j].coordinates:
>
>
> but no improvements.
>
> Many thanks, Alex

When you're looking for duplicates an efficient solution is likely to be
based on a set or dict object.

# untested
from collections import defaultdict

groups = defaultdict(list)
for item in IN:
c = item.coordinates
groups[c, c].append(item.label)
SN = []
for labels in groups.itervalues():
if len(labels) > 1:
SN.extend(labels) # or labels[1:] if you want to keep one item

Peter

```