<table cellspacing="0" cellpadding="0" border="0" ><tr><td valign="top" style="font: inherit;">I'm relatively new to Python and I'm trying to write a function that fills a dictionary according to the following rules and (example) data:<br><br>Rules:<br>* No duplicate values in field1<br>* No duplicate combinations of field2 and field3 (when a combination repeats, the record with the highest value in field4 has to be preserved)<br><br><br>Rec.no field1, field2, field3, field4<br>1. abc, def123, ghi123, 120 <-- new, insert in dictionary<br>2. abc, def123, ghi123, 120 <-- duplicate of 1, field4 has the same value. Do not insert in dictionary<br>3. bcd, def123, jkl125, 154 <-- new, insert in dictionary<br>4. efg, def123, jkl125, 175 <-- duplicate of 3 in field2 and field3, but higher value in field4. Remove 3 from the dict and replace it with 4.<br>5. hij, ghi345, jkl125, 175 <-- duplicate in field3 only, not in field2. New, insert in dict.<br><br><br>The resulting dictionary should be:<br><br>hij
{'F2': ' ghi345', 'F3': ' jkl125', 'F4': 175}<br>abc {'F2': ' def123', 'F3': ' ghi123', 'F4': 120}<br>efg {'F2': ' def123', 'F3': ' jkl125', 'F4': 175}<br><br>This is what I have come up with so far, but there is something wrong with it. The 'bcd' record should have been removed. When I run it, it prints:<br><br>bcd {'F2': ' def123', 'F3': ' jkl125', 'F4': 154}<br>hij {'F2': ' ghi345', 'F3': ' jkl125', 'F4': 175}<br>abc {'F2': ' def123', 'F3': ' ghi123', 'F4': 120}<br>efg {'F2': ' def123', 'F3': ' jkl125', 'F4': 175}<br><br>Below is what I wrote (simplified). It took me some time to figure out that I was looking at the wrong values in the wrong dictionary. I started again, but I am ending up with a lot of dictionaries and "for x in y" loops. I think there is a simpler way to do this.<br><br>Can somebody point me in the right direction and
explain to me how to do this? (and maybe have an alternative for the nesting. Because I may need to compare more fields. This is only a simplified dataset).<br><br><br>######### not working <br>def createResults(field1, field2, field3, field4):<br> #check if field1 exists.<br> if not results.has_key(field1):<br> <br> if results.has_key(field2):<br> #check if field2
already exists<br> <br> if results.has_key(field3):<br> #check if field3 already exists<br> #retrieve value
field4<br> existing_field4 = results[field2][F4]<br> #retrieve value existing field1 in dict<br> existing_field1 =
results[field1]<br> <br> #perform highest value check<br> if int(existing_field4) < int(field4):<br>
#remove existing record from dict.<br> del results[existing_field1]<br> values = {}<br> values['F2'] =
field2<br> values['F3'] = field3<br> values['F4'] = field4<br> results[field1] =
values<br> else:<br> pass<br> else:<br>
pass<br> else:<br> values = {}<br> values['F2'] = field2<br> values['F3'] = field3<br> values['F4'] =
field4<br> results[field1] = values<br> else:<br> pass<br> <br> <br><br><br> <br>for line in open("file.csv"):<br> field1, field2, field3, field4 = line.split(',')<br> createResults(field1, field2, field3, int(field4))<br> #because this is quick
and dirty I had to get rid of the \n in the csv<br><br>for i in results.keys():<br> print i, '\t', results[i] <br>################<br><br>contents file.csv<br><br>abc, def123, ghi123, 120<br>abc, def123, ghi123, 120<br>bcd, def123, jkl125, 154<br>efg, def123, jkl125, 175<br>hij, ghi345, jkl125, 175<br><br></td></tr></table><br>
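One possible direction, offered as a sketch rather than a definitive answer: in the posted code, results is keyed by field1, so results.has_key(field2) is never true for this data and every new field1 simply gets inserted. A second dictionary keyed by the (field2, field3) pair avoids the nesting. The names dedupe, read_records, by_pair and SAMPLE below are hypothetical, the code uses Python 3 syntax (the post uses Python 2), and it assumes that a record whose field1 is already present is always skipped. It also strips the leading spaces that line.split(',') leaves in the values.

```python
import csv
import io

# Sample data copied from the post (file.csv), inlined so the sketch is self-contained.
SAMPLE = """abc, def123, ghi123, 120
abc, def123, ghi123, 120
bcd, def123, jkl125, 154
efg, def123, jkl125, 175
hij, ghi345, jkl125, 175
"""


def read_records(lines):
    """Yield (field1, field2, field3, field4) tuples with field4 as an int."""
    # skipinitialspace drops the blank that follows each comma in the file.
    for field1, field2, field3, field4 in csv.reader(lines, skipinitialspace=True):
        yield field1, field2, field3, int(field4)


def dedupe(records):
    """Keep one record per field1 and one per (field2, field3) pair."""
    results = {}  # field1 -> {'F2': ..., 'F3': ..., 'F4': ...}
    by_pair = {}  # (field2, field3) -> field1 of the record currently kept
    for field1, field2, field3, field4 in records:
        if field1 in results:
            continue  # rule 1: this field1 is already stored (assumption: skip it)
        pair = (field2, field3)
        holder = by_pair.get(pair)
        if holder is not None:
            if field4 <= results[holder]['F4']:
                continue  # rule 2: stored record has the same or a higher field4
            del results[holder]  # rule 2: the new record wins, drop the old one
        results[field1] = {'F2': field2, 'F3': field3, 'F4': field4}
        by_pair[pair] = field1
    return results


results = dedupe(read_records(io.StringIO(SAMPLE)))
for key, value in results.items():
    print(key, value)
```

To compare more fields, the tuple used as the by_pair key can be extended, which keeps the function flat instead of adding another level of if statements.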