[Tutor] efficient method to search between two lists
Terry Carroll
carroll at tjc.com
Thu Mar 23 07:00:02 CET 2006
On Wed, 22 Mar 2006, Srinivas Iyyer wrote:
> I cannot think of any other smart method since these
> are the only two ways I know possible.
>
> Would any one please help me suggesting a neat and
> efficient way.
I'm thinking:
1) use sets to cut both lists down to only those elements that
will match at least one element in the other list;
2) loop and compare, but only on those elements you know will match
somewhere (because of step #1).
One try:
list_a = ['S83513\tNM_001117', 'X60435\tNM_001117',
          'U75370\tNM_005035', 'U05861\tNM_001353',
          'S68290\tNM_001353', 'D86864\tNM_145349',
          'D86864\tNM_003693', 'D86864\tNM_145351',
          'D63483\tNM_145349', 'D63483\tNM_003693',
          'D63483\tNM_145351', 'S66427\tNM_002892',
          'S57153\tNM_002892']
list_b = ['HC_G110\t1000_at\tS83513',
          'HC_G110\t1001_at\tD63483',
          'HC_G110\t1002_f_at\tD86864',
          'HC_G112\t1003_s_at\tX60435',
          'HC_G112\t1004_at\tS57153']

# Step 1: collect the key field of each list (first field of list_a,
# third field of list_b) and keep only the keys present in both.
set_a = set([(x.split('\t'))[0] for x in list_a])
set_b = set([(x.split('\t'))[2] for x in list_b])
intersection = set_a & set_b   # left as a set so "in" tests stay fast

# Step 2: loop and compare, but only over elements whose key is shared.
for (whole_b, keypart_b) in \
    [(x, x.split('\t')[2]) for x in list_b
     if x.split('\t')[2] in intersection]:
    for (whole_a, keypart_a) in \
        [(x, x.split('\t')[0]) for x in list_a
         if x.split('\t')[0] in intersection]:
        if keypart_b == keypart_a:
            # Parenthesized single argument prints the same in Python 2 and 3.
            print(whole_b+whole_a)
Gives as output:
HC_G110 1000_at S83513S83513 NM_001117
HC_G110 1001_at D63483D63483 NM_145349
HC_G110 1001_at D63483D63483 NM_003693
HC_G110 1001_at D63483D63483 NM_145351
HC_G110 1002_f_at D86864D86864 NM_145349
HC_G110 1002_f_at D86864D86864 NM_003693
HC_G110 1002_f_at D86864D86864 NM_145351
HC_G112 1003_s_at X60435X60435 NM_001117
HC_G112 1004_at S57153S57153 NM_002892
(Note, my tab settings are different from yours, hence the different
output.)
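A possible variation, offered only as a sketch and not as what the code above
does: instead of filtering both lists and then comparing every remaining pair,
build a dictionary that maps each key in list_a to the lines carrying it, then
make a single pass over list_b. The function name join_on_key and the shortened
sample_a / sample_b lists are made up for illustration; the field positions
(first field of list_a, third field of list_b) are taken from the data above.

```python
from collections import defaultdict

def join_on_key(list_a, list_b):
    """Pair each list_b line with every list_a line that shares its key."""
    # Index list_a by its first tab-separated field.
    index = defaultdict(list)
    for line_a in list_a:
        index[line_a.split('\t')[0]].append(line_a)

    # Single pass over list_b; its key is the third tab-separated field.
    joined = []
    for line_b in list_b:
        key = line_b.split('\t')[2]
        # .get() avoids adding empty entries for keys missing from list_a.
        for line_a in index.get(key, []):
            joined.append(line_b + line_a)
    return joined

# Shortened, illustrative subsets of the lists above.
sample_a = ['S83513\tNM_001117', 'D63483\tNM_145349',
            'D63483\tNM_003693', 'U75370\tNM_005035']
sample_b = ['HC_G110\t1000_at\tS83513', 'HC_G110\t1001_at\tD63483']

for line in join_on_key(sample_a, sample_b):
    print(line)
```

Each list is then walked once, so the work grows roughly with the combined
length of the lists plus the number of matches, rather than with the product
of the two filtered lists.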