[Tutor] reading Excel CSVs? (was: Re: recommended CSV module?)

Roger Merchberger zmerch at 30below.com
Thu May 13 15:55:25 EDT 2004


At 11:37 AM 5/13/2004 -0400, Lance E Sloan wrote:

I've never used the CSV module, I've always profiled my data & split it by 
hand, but...

>  Some of the CSV:
>
>  PHRASE,ITEM,MONTH,YEAR,PAGE^M"Investments, 2001-2002",Alternative asset
>  commitment approved with Yorktown Energy Partners V
>  L.P.,April,2002,283^M"Investments, 2001-2002",Alternative asset
>  commitment approved with BPG Investment Partnership VI
>  L.P.,April,2002,283^M
>
>That CSV came directly from Excel.  Note that the "^M" shown here are 
>actually carriage returns in the file.  I just loaded it into vi and 
>copied a few lines into this message.
>
>When I run the program, I get this error:
>
>  Traceback (most recent call last):
>    File "samplecsv", line 3, in ?
>      for row in reader:
>  _csv.Error: newline inside string
>
>I've tried telling the csv.reader that the lineterminator is '\n' or '\r' 
>in place of its default '\n\r', but neither worked.
>
>How do other folks use the csv module to read Excel files?

Quick-n-dirty?

Load the CSV into StarOffice / OpenOffice, then resave.

Most programs put quotes around all strings *no matter what* and only leave 
numerical fields unquoted. Excel puts quotes around strings *only if they 
contain commas*. Very odd behaviour, and fairly difficult to parse.

>BTW, I'm running this on a Mac OS X machine.  I've tried this with CSVs 
>from Excel running on both Windows and Mac and it comes out the same. 
>(Nice that it's consistent.)

Maybe across platforms, but not across different versions of Excel... 
sometimes they change the behaviour a bit during updates. Very odd.

If all your data is string-based (a.k.a. no numerical fields) I usually 
load/resave with StarOffice, then split on quotes instead of commas, then 
take every other field... but that's just me.

I have some sample code where I've done just that - if you'd like to see 
it, email me offlist & i'd be happy to forward it to you.

HTH,
Roger "Merch" Merchberger

--
Roger "Merch" Merchberger   | "Bugs of a feather flock together."
sysadmin, Iceberg Computers |           Russell Nelson
zmerch at 30below.com          |




More information about the Tutor mailing list