Trying to fix Invalid CSV File
Emile van Sebille
emile at fenx.com
Mon Aug 4 11:30:08 EDT 2008
John Machin wrote:
> On Aug 4, 6:15 pm, Ryan Rosario <uclamath... at gmail.com> wrote:
>> On Aug 4, 1:01 am, John Machin <sjmac... at lexicon.net> wrote:
>>
>>> On Aug 4, 5:49 pm, Ryan Rosario <uclamath... at gmail.com> wrote:
>>>> Thanks Emile! Works almost perfectly, but is there some way I can
>>>> adapt this to quote fields that contain a comma in them?
<snip>
> Emile's snippet is pushing it through the csv reading process, to
> demonstrate that his series of replaces works (on your *sole* example,
> at least).
Exactly -- just print out the results of the passed argument:
>>> rec.replace(',"',",'''").replace('",',"''',").replace('"','""').replace("'''",'"')
'123,"Here is some, text ""and some quoted text"" where the quotes should have been doubled",321'
It won't work if any of the quotes embedded in a field sit right next
to a comma.
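For anyone who would rather not squint at the one-liner, here is the same
chain of replaces unpacked into steps. This is only a sketch: the input
record is reconstructed from the output shown above, so treat it as an
assumption, and the name requote is made up for illustration. It also
assumes the data never contains three single quotes in a row, since that
sequence is used as a temporary placeholder.

```python
import csv

def requote(rec):
    """Double embedded quotes while leaving the field-delimiting quotes alone."""
    return (rec.replace(',"', ",'''")   # hide opening field quotes behind a placeholder
               .replace('",', "''',")   # hide closing field quotes the same way
               .replace('"', '""')      # double every quote that is left (the embedded ones)
               .replace("'''", '"'))    # turn the placeholders back into real quotes

# Assumed input, inferred from the result printed in this post.
rec = ('123,"Here is some, text "and some quoted text" '
       'where the quotes should have been doubled",321')

fixed = requote(rec)
fields = next(csv.reader([fixed]))  # parse the repaired record with the csv module
print(fixed)
print(len(fields), fields)
```

The comma inside the middle field survives because the field stays quoted;
only the quotes inside it get doubled.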
I'd run it against the file. Presumably, you've got a consistent field
count expectation per record. Any resulting record not matching is
suspect and will identify records this approach won't address.
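A rough sketch of that field-count check follows. The expected count of 3,
the sample records and the names requote and suspect_records are all
assumptions for illustration; for a real file you would pass the open file
object instead of the list.

```python
import csv

EXPECTED_FIELDS = 3  # assumed field count for this illustration

def requote(rec):
    """Same chain of replaces as in the post: double the embedded quotes."""
    return (rec.replace(',"', ",'''").replace('",', "''',")
               .replace('"', '""').replace("'''", '"'))

def suspect_records(lines, expected=EXPECTED_FIELDS):
    """Return (line number, raw record) pairs whose repaired form has the wrong field count."""
    suspects = []
    for lineno, line in enumerate(lines, 1):
        raw = line.rstrip("\n")
        fields = next(csv.reader([requote(raw)]))
        if len(fields) != expected:
            suspects.append((lineno, raw))
    return suspects

# Hypothetical sample: the second record has an embedded quote next to a comma.
sample = [
    '123,"Here is some, text "and some quoted text" where the quotes should have been doubled",321',
    '1,"He said "no", then left",2',
]

suspects = suspect_records(sample)
for lineno, raw in suspects:
    print(lineno, raw)
```

A record can still be mangled in a way that happens to keep the field count
right, so a clean run narrows the search rather than proving the file is
good.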
There are probably better ways, but sometimes it's fun to create
executable line noise. :)
Emile