newbie:unique problem

Brian van den Broek bvande at po-box.mcgill.ca
Thu Mar 17 17:31:59 EST 2005


Heiko Wundram said unto the world upon 2005-03-17 16:29:
> On Thursday 17 March 2005 20:08, Leeds, Mark wrote:
> 
>>But, I also want it to get rid of the AAA KP because
>>there are two AAA's even though the last two letters
>>are different. It doesn't matter to me which one
>>is gotten rid of but I don't know how to change
>>the function to handle this ? I have a feeling
>>it's not that hard though ? Thanks.
> 
> 
> Doing the same thing Brian van den Broek did with sets (also for 2.4 only):
> 
> def uniqueItems(oldlist,comppos=3):
>     rv = {}
>     for i in reversed(oldlist):
>         rv[i[:comppos]] = i
>     return rv.values()
> 
> 
> >>> uniqueItems(["AAA BC","BBB KK","CCC TD","AAA KP","CCC TD"])
> 
> ['AAA BC', 'BBB KK', 'CCC TD']
> 
> heiko at heiko ~ $ python2.4 /usr/local/lib/python2.4/timeit.py -s "import test; 
> uniqueItems = test.uniqueItems; uniqueItemsBrian = test.uniqueItemsBrian" 
> "uniqueItemsBrian(uniqueItems)"
> 100000 loops, best of 3: 13.8 usec per loop
> 
> heiko at heiko ~ $ python2.4 /usr/local/lib/python2.4/timeit.py -s "import test; 
> uniqueItems = test.uniqueItems; uniqueItemsHeiko = test.uniqueItemsHeiko" 
> "uniqueItemsHeiko(uniqueItems)"
> 100000 loops, best of 3: 9.28 usec per loop
> 
> Seems like the dictionary solution is faster, at least for n=3. Do your own 
> tests... ;)

I'm not surprised the dict approach is faster than mine. Mine was 
maintaining two data structures. But that was done so as to guarantee 
the output's ordering was the same as the input's, while also taking 
advantage of the lookup speed that sets and dicts provide. Am I not 
right in thinking that with the dict approach there is no guarantee 
that the order from the original list will be preserved?
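
For illustration, here is a minimal sketch of the two-structure idea described above: a set for fast membership tests on the prefix, plus a list that records items in the order they are first seen. The name unique_items_ordered is hypothetical, and this is not necessarily the code posted earlier in the thread.

```python
def unique_items_ordered(oldlist, comppos=3):
    """Keep the first item seen for each prefix, in input order."""
    seen = set()    # prefixes already encountered (fast lookup)
    result = []     # kept items, in their original order
    for item in oldlist:
        key = item[:comppos]
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


print(unique_items_ordered(["AAA BC", "BBB KK", "CCC TD", "AAA KP", "CCC TD"]))
# prints ['AAA BC', 'BBB KK', 'CCC TD']
```

Because the result is built as a list, its order does not depend on how a dict or set happens to arrange its keys.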

Also, Heiko, I wonder what is the reason for reversed(oldlist)? Since 
the list isn't being mutated, there isn't any danger in forward 
iteration over it. (Plus, unless I'm mistaken, it's the only thing 
making yours a 2.4-only solution.)
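
One observable effect of the iteration direction, offered as a guess rather than as Heiko's stated reason: plain dict assignment overwrites, so iterating forward keeps the last item for each prefix, while iterating in reverse keeps the first. The sketch below uses hypothetical helper names (keep_last, keep_first) and shows a forward-iterating variant based on dict.setdefault that also keeps the first item without needing reversed().

```python
def keep_last(oldlist, comppos=3):
    """Forward iteration with plain assignment: later items overwrite earlier ones."""
    rv = {}
    for i in oldlist:
        rv[i[:comppos]] = i
    return rv


def keep_first(oldlist, comppos=3):
    """Forward iteration with setdefault: only the first item per prefix is stored."""
    rv = {}
    for i in oldlist:
        rv.setdefault(i[:comppos], i)
    return rv


data = ["AAA BC", "BBB KK", "CCC TD", "AAA KP", "CCC TD"]
print(keep_last(data)["AAA"])   # prints AAA KP
print(keep_first(data)["AAA"])  # prints AAA BC
```

Since the original question said it does not matter which duplicate is dropped, either variant satisfies it; the difference only matters if the first occurrence should win.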

Best to all,

Brian vdB



