Troubles with CSV file

Peter Hansen peter at engcorp.com
Fri May 14 07:27:59 EDT 2004


Vladimir Ignatov wrote:

> I have a big CSV file, which I must read and do some processing with it.
> Unfortunately I can't figure out how to use standard *csv* module in my
> situation. The problem is that some records look like:
> 
> ""read this, man"", 1
> 
> which should be decoded back into the:
> 
>     "read this, man"
>     1

Do you have anything that already accepts this particular dialect?
It seems to me that the above could just as easily be interpreted
as three fields (using parentheses here to mark the field boundaries):

   (""read this) ( man"") ( 1)

Is it possible that what you have is not really any standard CSV
format, but just something home-brewed?  In that case, you may
well need to massage it before feeding it to the csv module.
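
For what it's worth, here is a minimal sketch of what the csv module makes
of the sample line with its default (Excel) dialect.  The variable names are
only illustrative, and it assumes a current Python 3 rather than whatever
version you are running:

```python
import csv

# The sample record from the original post, exactly as it appears in the file.
line = '""read this, man"", 1'

# Parse it with the default (Excel) dialect; csv.reader accepts any
# iterable of strings, so a one-element list is enough here.
row = next(csv.reader([line]))

# The comma inside the "quoted" text is treated as a field separator.
print(len(row), row)
```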

Or, if you can define how your example works in terms of delimiters,
quoting and such, maybe there's a way to make the csv module handle
it without complaints.

As far as I can see, you want either the doubled quotation marks to
be treated as single quotation marks, or you want the outer quotation
marks to magically quote the whole string containing the comma even
though it contains the quotation marks already.  I don't think the csv
module can handle the latter (and it's probably an impossible goal), so
you must really want the former.  In that case, unfortunately, you
are also screwed: doubled quotation marks imply that 'doublequote' is
True, but then 'quotechar' must have been '"' in the first place, and
a first field that really contains quotation marks would have to be
written with triple quotes around it, as the Excel dialect does.
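
To illustrate the triple-quote point, this is a small sketch (assuming
Python 3 and the module's default dialect; the names are mine) of how
csv.writer encodes the two values you say you want back:

```python
import csv
import io

# The values the original poster wants to recover: a string that contains
# literal quotation marks and a comma, followed by the number 1.
wanted = ['"read this, man"', 1]

buffer = io.StringIO()
# Default dialect: delimiter=',', quotechar='"', doublequote=True.
csv.writer(buffer).writerow(wanted)
encoded = buffer.getvalue().rstrip("\r\n")

print(encoded)  # """read this, man""",1
```

One quote on each side encloses the field, and each literal quote inside
it is doubled, which is where the three in a row come from.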

Can you just blindly replace every doubled quotation mark ("") with a
tripled one (""") in the input string first?  That might be the easiest
approach.
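
A rough sketch of that substitution, assuming Python 3 and that every
record looks like your sample; parse_homebrew_line is a hypothetical
helper name, not something the csv module provides:

```python
import csv

def parse_homebrew_line(line):
    """Hypothetical helper: triple every doubled quote, then parse as Excel CSV."""
    patched = line.replace('""', '"""')
    # skipinitialspace drops the blank that follows each comma in the sample.
    return next(csv.reader([patched], skipinitialspace=True))

fields = parse_homebrew_line('""read this, man"", 1')
print(fields)  # ['"read this, man"', '1']
```

Note that the replacement really is blind: it would also alter a
legitimately empty quoted field ("") and any quotes that are already
doubled inside a well-formed field, so it only suits a file where doubled
quotes always mean what they mean in your sample.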

-Peter


