Trying to fix Invalid CSV File
Emile van Sebille
emile at fenx.com
Mon Aug 4 01:38:22 EDT 2008
Ryan Rosario wrote:
> I have a very large CSV file that contains double quoted fields (since
> they contain commas). Unfortunately, some of these fields also contain
> other double quotes and I made the painful mistake of forgetting to
> escape or double the quotes inside the field:
>
> 123,"Here is some, text "and some quoted text" where the quotes should
> have been doubled",321
>
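import csv

# The record is a single line; it is split here only to fit the page.
rec = ('123,"Here is some, text "and some quoted text" where the quotes '
       'should have been doubled",321')

# Mark the real field delimiters, double the remaining interior quotes,
# then restore the delimiters. On Python 2, use .next() instead of next().
print(next(csv.reader([rec.replace(',"', ',"""')    # mark opening quotes
                          .replace('",', '""",')    # mark closing quotes
                          .replace('"""', "'''")    # park the marks as '''
                          .replace('"', '""')       # double interior quotes
                          .replace("'''", '"')])))  # restore delimiters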
['123', 'Here is some, text "and some quoted text" where the quotes
should have been doubled', '321']
:))
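
The replace chain above only recognizes a delimiter quote when it sits next to a comma, so it misses a quoted field that starts or ends the line, and it breaks if the text already contains '''. Below is a rough sketch of the same idea using a regular expression, plus a loop that writes a repaired file. The names repair_line and repair_file are made up for illustration. It assumes one record per physical line, no interior quote directly next to a comma, and no quotes that were already doubled; data that breaks those assumptions would need a smarter heuristic.

```python
import csv
import re

# A quote with a non-comma character on both sides is treated as interior.
# Quotes at the start or end of the line, or next to a comma, are assumed
# to be real field delimiters and are left alone.
_INTERIOR_QUOTE = re.compile(r'(?<=[^,])"(?=[^,])')


def repair_line(line):
    """Double the interior quotes in one CSV record (hypothetical helper)."""
    return _INTERIOR_QUOTE.sub('""', line.rstrip("\r\n"))


def repair_file(src_path, dst_path):
    """Write a repaired copy of src_path to dst_path (hypothetical helper)."""
    with open(src_path, newline="") as src, \
         open(dst_path, "w", newline="") as dst:
        for line in src:
            dst.write(repair_line(line) + "\n")


rec = ('123,"Here is some, text "and some quoted text" where the quotes '
       'should have been doubled",321')
fixed = repair_line(rec)
row = next(csv.reader([fixed]))
print(row)
```

It is worth running the repaired file back through csv.reader and checking that every row has the expected number of fields, since a quote that happens to sit next to a comma inside the text will still be misread as a delimiter.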
Emile
> Has anyone dealt with this problem before? Any ideas of an algorithm I
> can use for a Python script to create a new, repaired CSV file?
>
> TIA,
> Ryan