Best way to structure data for efficient searching
Larry.Martell@gmail.com
larry.martell at gmail.com
Wed Mar 28 14:39:54 EDT 2012
I have the following use case:
I have a set of data that contains 3 fields: K1, K2, and a
timestamp. There are duplicates in the data set, and they all have to
be processed.
Then I have another set of data with 4 fields: K3, K4, K5, and a
timestamp. There are also duplicates in that data set, and they also
all have to be processed.
I need to find all the items in the second data set where K1==K3 and
K2==K4 and the 2 timestamps are within 20 seconds of each other.
I have this working, but the way I did it seems very inefficient - I
simply put the data in 2 arrays (as tuples) and then walked through
the entire second data set once for each item in the first data set,
looking for matches.
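The approach described above amounts to roughly the following sketch. The function name, the sample records, and the use of datetime objects for the timestamps are assumptions made for illustration, as is treating "within 20 seconds" as inclusive; the original code is not shown here.

```python
from datetime import datetime, timedelta

# Assumption: "within 20 seconds" is inclusive and timestamps are datetimes.
WINDOW = timedelta(seconds=20)

def brute_force_matches(first, second, window=WINDOW):
    """Compare every (K1, K2, ts) record with every (K3, K4, K5, ts) record."""
    matches = []
    for k1, k2, ts1 in first:
        # The whole second data set is walked once per first-set record.
        for k3, k4, k5, ts2 in second:
            if k1 == k3 and k2 == k4 and abs(ts1 - ts2) <= window:
                matches.append(((k1, k2, ts1), (k3, k4, k5, ts2)))
    return matches

# Hypothetical sample data; duplicates are kept and each one is processed.
t0 = datetime(2012, 3, 28, 14, 0, 0)
first = [("a", 1, t0), ("a", 1, t0)]
second = [
    ("a", 1, "x", t0 + timedelta(seconds=10)),
    ("a", 1, "y", t0 + timedelta(seconds=30)),
    ("b", 1, "z", t0),
]
pairs = brute_force_matches(first, second)
```

With n records in the first set and m in the second, this performs n * m comparisons, which is where the inefficiency comes from.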
Is there a better, more efficient way I could have done this?
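One possible direction, offered only as a sketch under stated assumptions: group the second data set in a dictionary keyed on (K3, K4), sort each group by timestamp, and use the standard-library bisect module to pick out the 20-second window. The function name, the sample records, the datetime timestamps, and the inclusive window are all assumptions for illustration.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from datetime import datetime, timedelta

def indexed_matches(first, second, window=timedelta(seconds=20)):
    """Find matches via a dict keyed on (K3, K4) plus a sorted timestamp list."""
    # Group the second data set by key; duplicates stay in the bucket.
    buckets = defaultdict(list)
    for k3, k4, k5, ts2 in second:
        buckets[(k3, k4)].append((ts2, k5))

    # Sort each bucket by timestamp and keep a parallel list for bisect.
    stamps = {}
    for key, bucket in buckets.items():
        bucket.sort(key=lambda item: item[0])
        stamps[key] = [ts for ts, _ in bucket]

    matches = []
    for k1, k2, ts1 in first:
        bucket = buckets.get((k1, k2))
        if not bucket:
            continue
        ts_list = stamps[(k1, k2)]
        # Slice bounds for timestamps in [ts1 - window, ts1 + window].
        lo = bisect_left(ts_list, ts1 - window)
        hi = bisect_right(ts_list, ts1 + window)
        for ts2, k5 in bucket[lo:hi]:
            matches.append(((k1, k2, ts1), (k1, k2, k5, ts2)))
    return matches

# Hypothetical sample data, including duplicates in the first set.
t0 = datetime(2012, 3, 28, 14, 0, 0)
first = [("a", 1, t0), ("a", 1, t0)]
second = [
    ("a", 1, "x", t0 + timedelta(seconds=10)),
    ("a", 1, "y", t0 + timedelta(seconds=30)),
    ("b", 1, "z", t0),
    ("a", 1, "w", t0 - timedelta(seconds=20)),
]
pairs = indexed_matches(first, second)
```

Each record in the first set then costs one dictionary lookup and two binary searches plus the number of actual matches, instead of a pass over the entire second data set.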