[Tutor] Re: removal of duplicates from .csv files

alan.gauld@bt.com alan.gauld@bt.com
Fri, 26 Jan 2001 17:41:47 -0000


> > I have been given several comma-delimited (.csv) files, 
> > charged with is to remove duplicate entries.  

> One approach you may want to consider is to create a 
> dictionary with the phone number and/or address as a key.

That was the approach I was going to suggest provided 
you have enough memory...
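
A minimal sketch of that dictionary idea, for illustration only. The
column names ("phone", "address"), the sample data and the helper name
dedupe_by_key are all assumptions, since the layout of the real files
isn't given in the thread. Every row is held in memory at once, hence
the caveat about memory.

```python
import csv
import io

def dedupe_by_key(rows, key_fields):
    """Return one row per distinct key (hypothetical helper).

    A later row with the same key overwrites the earlier one,
    so the last duplicate seen is the one that survives.
    """
    seen = {}
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        seen[key] = row
    return list(seen.values())

# Made-up sample standing in for one of the .csv files.
SAMPLE = """name,phone,address
Ann,555-0100,1 Elm St
Bob,555-0199,9 Oak Ave
Ann Smith,555-0100,1 Elm St
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
unique = dedupe_by_key(rows, ["phone", "address"])
```

For a real file you would pass an open file object to csv.DictReader
instead of the io.StringIO wrapper used here.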

One question you must answer is which duplicate you want to eliminate.
Assuming only the two key fields are duplicated, 
which of the other data is the right one to keep?

If it's always the first one, then that's easier using 
sort and a custom compare function; if it's always the 
last one, that's easier with a dictionary...
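
A rough sketch of the keep-the-first-one case using a sort. Current
Python versions no longer accept a compare function in sort, so a key
function stands in for it here; that substitution, the helper name
keep_first, the field name "phone" and the sample rows are all
assumptions made for illustration. It relies on sorted() being stable,
so rows with equal keys stay in their original relative order.

```python
from itertools import groupby
from operator import itemgetter

def keep_first(rows, key_fields):
    """Keep the earliest row for each distinct key (hypothetical helper)."""
    key = itemgetter(*key_fields)
    # Stable sort: duplicates end up adjacent, earliest one first.
    ordered = sorted(rows, key=key)
    # groupby yields each run of equal keys; take the first row of each run.
    return [next(group) for _, group in groupby(ordered, key=key)]

# Made-up rows standing in for parsed .csv records.
rows = [
    {"name": "Ann", "phone": "555-0100"},
    {"name": "Bob", "phone": "555-0199"},
    {"name": "Ann Smith", "phone": "555-0100"},
]

first_only = keep_first(rows, ["phone"])
```

Note that the result comes back in key order rather than in the
original file order.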

Alan G.