removing duplicates from .csv files

Dave Cole djc at object-craft.com.au
Thu Jan 25 17:45:21 EST 2001


>>>>> "mspiggie" == mspiggie  <mspiggie at my-deja.com> writes:

mspiggie> I have been given several comma-delimited (.csv) files, each
mspiggie> containing as many as several thousand lines of entries.
mspiggie> Among the tasks I've been charged with is to remove
mspiggie> duplicate entries.  The files each contain fields for
mspiggie> Contact Name, Company Name, Phone Number, and Address, among
mspiggie> other fields, which vary from file to file.

Although I can't offer much help on the data processing part, you
might want to use my fast CSV parser module.  It understands MS-style
CSV, in which quoted fields can span multiple lines, and it runs just
a teeny bit slower than string.split().

        http://www.object-craft.com.au/projects/csv/
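
For the de-duplication step itself, here is a minimal sketch of one
possible approach.  It uses the csv module from the Python standard
library rather than the parser linked above, whose API is not shown in
this message.  The sample data, the dedupe_rows helper, and the choice
of key columns are all assumptions made for illustration; which fields
count as "the same entry" will depend on your files.

```python
import csv
import io


def dedupe_rows(rows, key_columns):
    """Yield each row whose normalized key has not been seen before.

    Hypothetical helper: two rows count as duplicates when the chosen
    columns match after trimming whitespace and ignoring case.
    """
    seen = set()
    for row in rows:
        key = tuple(row[col].strip().lower() for col in key_columns)
        if key not in seen:
            seen.add(key)
            yield row


# Made-up sample data; the Address field is quoted and spans two lines.
SAMPLE = (
    'Contact Name,Company Name,Phone Number,Address\n'
    'Ann Lee,Acme,555-0100,"1 Main St\nSuite 2"\n'
    'ann lee,ACME,555-0100,"1 Main St\nSuite 2"\n'
    'Bob Ray,Widgets,555-0199,9 Elm Rd\n'
)

# DictReader takes the column names from the first line of the input.
reader = csv.DictReader(io.StringIO(SAMPLE))
unique_rows = list(
    dedupe_rows(reader, ['Contact Name', 'Company Name', 'Phone Number'])
)

for row in unique_rows:
    print(row['Contact Name'], '|', row['Company Name'])
```

The first occurrence of each entry is kept and later matches are
dropped.  Because the fields vary from file to file, the key columns
are passed in rather than hard-coded.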

- Dave

-- 
http://www.object-craft.com.au


