[Tutor] using python to read csv clean record and write out csv

Sacha Rook sacharook at gmail.com
Fri Nov 2 11:40:18 CET 2012


Hi

I have a problem with a CSV file from a supplier. They export their data to
CSV, but the last column in each record is a description that is marked
up with HTML.

I am trying to automate the processing of this CSV so it can be uploaded
elsewhere in a usable format. If I open the CSV with CSVed, it looks as if
the records aren't escaped correctly: after a while I find HTML tags and
text on the next line/record.

Yet if I use 'Open with' Excel, the description stays on the correct line/record?
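
One possible explanation, assuming the supplier wraps the description in
double quotes: a quoted CSV field is allowed to contain line breaks, which
Excel honours while a line-oriented viewer may not. A small check along these
lines, assuming Python 3 and using made-up sample data rather than the real
file, would show whether the csv module keeps such a record together:

```python
import csv
import io

# Made-up sample: the quoted description in record 1 spans two physical lines.
sample = (
    'id,description\r\n'
    '1,"<p>First line</p>\n<p>Second line</p>"\r\n'
    '2,"<b>Other</b>"\r\n'
)

# newline='' leaves line endings untouched so the csv module can interpret them.
rows = list(csv.reader(io.StringIO(sample, newline='')))

for row in rows:
    print(row)
```

If the real file behaves the same way, the records are probably fine and only
the viewer is splitting them.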

I want to use Python to read these records in and write out a valid CSV with
the descriptions intact, preferably without the HTML tags, i.e. as a string
of plain text with newlines/CRs where appropriate.

So far I have this, but I don't know where to go from here. Can someone help me?

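import csv

# Raw strings keep the backslashes in the Windows paths literal.
# Binary mode ('rb'/'wb') is what the Python 2 csv module expects.
infile = open(r'c:\data\input.csv', 'rb')
outfile = open(r'c:\data\output.csv', 'wb')

reader = csv.reader(infile)
writer = csv.writer(outfile)

# Copy every record across unchanged for now.
for line in reader:
    print line
    writer.writerow(line)

# Close both files so the output is flushed to disk.
infile.close()
outfile.close()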
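
A rough sketch of where this could go next, assuming Python 3, a UTF-8 file,
and that the description is always the last column. The helper names
strip_html and clean_csv are hypothetical, and the regex-based tag removal is
only an approximation; badly nested or unusual markup may need a real HTML
parser:

```python
import csv
import html
import re


def strip_html(text):
    """Turn an HTML fragment into plain text (rough, regex-based)."""
    # Treat <br> and common closing block tags as line breaks.
    text = re.sub(r'(?i)<br\s*/?>|</(p|div|li|h[1-6])\s*>', '\n', text)
    # Drop every remaining tag.
    text = re.sub(r'<[^>]+>', '', text)
    # Decode entities such as &amp; and turn non-breaking spaces into spaces.
    text = html.unescape(text).replace('\xa0', ' ')
    # Trim each line and drop the empty ones.
    lines = [line.strip() for line in text.splitlines()]
    return '\n'.join(line for line in lines if line)


def clean_csv(in_path, out_path, encoding='utf-8'):
    """Copy a CSV file, replacing the last column with plain text."""
    # newline='' lets the csv module handle line breaks inside quoted fields.
    with open(in_path, newline='', encoding=encoding) as infile, \
         open(out_path, 'w', newline='', encoding=encoding) as outfile:
        reader = csv.reader(infile)
        # Quote every field so multi-line descriptions stay in one record.
        writer = csv.writer(outfile, quoting=csv.QUOTE_ALL)
        for row in reader:
            if row:
                row[-1] = strip_html(row[-1])
            writer.writerow(row)
```

It could be called as clean_csv(r'c:\data\input.csv', r'c:\data\output.csv').
The encoding is a guess and may need changing to match what the supplier
actually exports.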

I have attached the input.csv; I hope this is good form here?

I know I am far from done, but I don't know how to proceed :-)

Thanks all
-------------- next part --------------
A non-text attachment was scrubbed...
Name: input.csv
Type: text/csv
Size: 3869406 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/tutor/attachments/20121102/b58c1162/attachment-0001.csv>

