Best way to structure data for efficient searching
larry.martell at gmail.com
Wed Mar 28 20:39:54 CEST 2012
I have the following use case:
I have a set of data that contains 3 fields, K1, K2 and a
timestamp. There are duplicates in the data set, and they all have to
be processed.
Then I have another set of data with 4 fields: K3, K4, K5, and a
timestamp. There are also duplicates in that data set, and they also
all have to be processed.
I need to find all the items in the second data set where K1==K3 and
K2==K4 and the 2 timestamps are within 20 seconds of each other.
I have this working, but the way I did it seems very inefficient - I
simply put the data in 2 arrays (as tuples) and then walked through
the entire second data set once for each item in the first data set,
looking for matches.
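That brute-force walk can be sketched as follows (a minimal reconstruction, not the original code; the tuple layouts (K1, K2, ts) and (K3, K4, K5, ts) with numeric timestamps in seconds are assumptions):

```python
def naive_matches(set1, set2, window=20):
    """Compare every item in set2 against every item in set1: O(n * m)."""
    matches = []
    for k1, k2, t1 in set1:
        for k3, k4, k5, t2 in set2:
            # Keys must match and timestamps must be within the window.
            if k1 == k3 and k2 == k4 and abs(t1 - t2) <= window:
                matches.append(((k1, k2, t1), (k3, k4, k5, t2)))
    return matches
```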
Is there a better, more efficient way I could have done this?
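One common improvement (a sketch under the same assumed tuple layout as above) is to index the first data set in a dict keyed on (K1, K2), with each key's timestamps kept sorted, and then binary-search the 20-second window with bisect. That replaces the O(n * m) double scan with roughly O((n + m) log n):

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

def indexed_matches(set1, set2, window=20):
    """Match items on (K1, K2) == (K3, K4) with timestamps within `window` seconds."""
    # Build the index: (K1, K2) -> sorted list of timestamps from set1.
    index = defaultdict(list)
    for k1, k2, t1 in set1:
        index[(k1, k2)].append(t1)
    for timestamps in index.values():
        timestamps.sort()

    matches = []
    for k3, k4, k5, t2 in set2:
        timestamps = index.get((k3, k4))
        if not timestamps:
            continue  # no item in set1 shares this key pair
        # Binary-search the [t2 - window, t2 + window] range.
        lo = bisect_left(timestamps, t2 - window)
        hi = bisect_right(timestamps, t2 + window)
        for t1 in timestamps[lo:hi]:
            matches.append(((k3, k4, t1), (k3, k4, k5, t2)))
    return matches
```

Since duplicates are kept in the sorted lists, every qualifying pair is still reported; the duplicates in set2 are handled naturally because each of its items is looked up independently.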