Best way to structure data for efficient searching

Larry.Martell@gmail.com larry.martell at gmail.com
Wed Mar 28 14:39:54 EDT 2012


I have the following use case:

I have a set of data that is contains 3 fields, K1, K2 and a
timestamp. There are duplicates in the data set, and they all have to
processed.

Then I have another set of data with 4 fields: K3, K4, K5, and a
timestamp. There are also duplicates in that data set, and they also
all have to be processed.

I need to find all the items in the second data set where K1==K3 and
K2==K4 and the 2 timestamps are within 20 seconds of each other.

I have this working, but the way I did it seems very inefficient - I
simply put the data in 2 arrays (as tuples) and then walked through
the entire second data set once for each item in the first data set,
looking for matches.

Is there a better, more efficient way I could have done this?



More information about the Python-list mailing list