parsing CSV files with quotes

Chris Ryland cpr at emsoftware.com
Fri Mar 31 22:59:53 CEST 2000


Warren--

BTW, have you seen
<ftp://ftp.python.org/pub/python/contrib-09-Dec-1999/Database/exported.README>?
--
Cheers!
/ Chris Ryland, President / Em Software, Inc. / www.emsoftware.com

"Warren Postma" <embed at geocities.com> wrote in message
news:BDLE4.1642$HG1.47883 at nnrp1.uunet.ca...
> Suppose I have a CSV file where line 1 is the column names, and lines 2..n
> are comma-separated values, where all string fields are quoted like
> this:
>
> ID, NAME, AGE
> 1, "Postma, Warren", 30
> 2, "Twain, Shania",  31
> 3, "Nelson, Willy",  57
> 4, "Austin, \"Stone Cold\" Steve", 34
>
> So, the obvious thing I tried is:
>
> >>> import string
> >>> print string.splitfields("4, \"Austin, \\\"Stone Cold\\\" Steve, 34",",")
> ['4', ' "Austin', ' \\"Stone Cold\\" Steve', ' 34']
>
> Hmm. Interesting. So I tried this:
>
> >>> print string.splitfields(r'4, "Austin, \"Stone Cold\" Steve", 34')
> ['4,', '"Austin,', '\\"Stone', 'Cold\\"', 'Steve",', '34']
>
> I'm getting close, I can feel it!
>
> The Rules:
>
> 1. All integer and other fields are output as ASCII.
> 2. String fields have quotes. Commas are allowed inside the quotes.
> 3. Quotes inside quotes are escaped by a backslash.
> 4. Backslashes are themselves escaped by a backslash.
>
> Is this complex enough that I basically need the "parser" module of
> Python?
>
> Problem is I'm scared of it. Does anyone have any parser tutorials,
> HOWTOs, or links?
>
> Or is this beast solvable by judicious use of Regular Expressions?
>
> While I'm taking up bandwidth, I'll ask another silly question:
>
> Is there a "compressed dbShelve" out there anywhere? In this case I just
> want to store arrays and dictionaries of built-in Python types, in a
> compressed manner, in a BSD database. Has anyone heard of something like
> this?
>
> Warren
>
>
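The four rules quoted above seem to map fairly directly onto the reader options of the standard csv module in current Python 3 (a module newer than this thread). A minimal sketch, assuming the file looks like the sample in the question; the function name parse_quoted_csv and the SAMPLE constant are made up for illustration:

```python
import csv
import io

# Sample data copied from the question; a raw string keeps the backslashes.
SAMPLE = r'''ID, NAME, AGE
1, "Postma, Warren", 30
2, "Twain, Shania",  31
3, "Nelson, Willy",  57
4, "Austin, \"Stone Cold\" Steve", 34
'''


def parse_quoted_csv(text):
    """Hypothetical helper: return the rows of text as lists of strings."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=',',
        quotechar='"',          # rule 2: string fields are quoted
        escapechar='\\',        # rules 3 and 4: backslash escapes " and \
        doublequote=False,      # a doubled "" is not used as an escape here
        skipinitialspace=True,  # ignore the blanks that follow each comma
    )
    return [row for row in reader]


if __name__ == '__main__':
    for row in parse_quoted_csv(SAMPLE):
        print(row)
```

Every field comes back as a string, so numeric columns such as AGE would still need an explicit int() call.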

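On the regular-expression question: one line at a time, a single pattern that matches "one field plus its trailing comma" looks like it is enough, without a full parser. The sketch below is only one way to do it, assuming each record sits on one line; FIELD, UNESCAPE and split_line are names invented here:

```python
import re

# One field, followed by a comma or the end of the line.
FIELD = re.compile(r'''
    \s*                          # blanks before the field
    (?: "((?:[^"\\]|\\.)*)"      # group 1: quoted text, backslash escapes allowed
      | ([^,"]*)                 # group 2: unquoted text up to the next comma
    )
    \s*(,|$)                     # group 3: the comma, or '' at end of line
''', re.VERBOSE)

# A backslash followed by any character stands for that character.
UNESCAPE = re.compile(r'\\(.)')


def split_line(line):
    """Hypothetical helper: split one record into a list of strings."""
    line = line.rstrip('\r\n')
    fields = []
    pos = 0
    while True:
        m = FIELD.match(line, pos)
        if m is None:
            raise ValueError('malformed field at column %d' % pos)
        quoted, bare, sep = m.groups()
        if quoted is not None:
            fields.append(UNESCAPE.sub(r'\1', quoted))
        else:
            fields.append(bare.strip())
        pos = m.end()
        if sep == '':            # reached the end of the line
            return fields


if __name__ == '__main__':
    print(split_line(r'4, "Austin, \"Stone Cold\" Steve", 34'))
```

An unterminated quote raises ValueError rather than being silently accepted, which seemed the safer default for a sketch.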

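As for a "compressed dbShelve": I don't know of a ready-made one, but the idea can be sketched by pickling each value and running it through zlib before it goes into the database. The class below is hypothetical, and it assumes the standard dbm module as a stand-in for a BSD database, since which backend dbm picks depends on the installation:

```python
import dbm
import pickle
import zlib


class CompressedShelf:
    """Hypothetical dict-like store: values are pickled, then zlib-compressed."""

    def __init__(self, path, flag='c', level=6):
        # flag 'c' opens the database, creating it if it does not exist
        self._db = dbm.open(path, flag)
        self._level = level

    def __setitem__(self, key, value):
        data = zlib.compress(pickle.dumps(value), self._level)
        self._db[key.encode('utf-8')] = data

    def __getitem__(self, key):
        data = self._db[key.encode('utf-8')]
        return pickle.loads(zlib.decompress(data))

    def __contains__(self, key):
        return key.encode('utf-8') in self._db

    def close(self):
        self._db.close()
```

Usage would look like shelf = CompressedShelf('people'); shelf['row4'] = {'name': 'Austin', 'age': 34}; shelf.close(). Compression only pays off for values of some size, and pickle should never be used to load data from an untrusted source.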