parsing CSV files with quotes
Niklas Frykholm
r2d2 at acc.umu.se
Fri Mar 31 03:04:08 EST 2000
On Thu, 30 Mar 2000 11:50:27 -0500, Warren Postma <embed at geocities.com> wrote:
>Suppose I have a CSV file where line 1 is the column names, and lines 2..n
>are comma separated variables, where all String fields are quoted like this:
>
>ID, NAME, AGE
>1, "Postma, Warren", 30
>2, "Twain, Shania", 31
>3, "Nelson, Willy", 57
>4, "Austin, \"Stone Cold\" Steve", 34
[...]
>Or is this beasty solveable by judicious use of Regular Expressions?
Sure...
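import re

def dec(s):
    "Decode \\\" and \\\\."
    # Drop the backslash in front of any escaped character.
    return re.sub(r"\\(.)", r"\1", s)

def csv_parse(s):
    "Parse csv-string."
    # Group 1 is an unquoted field (assumed to be an integer), group 2 a quoted one.
    return map(lambda x: (x[0] and [int(x[0])] or [dec(x[1])])[0],
               re.findall(r'\s*([^"]+?)\s*,|\s*"(.*?[^\\])"\s*,', s+","))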
>>> s = r'4, "Austin, \"Stone Cold\" Steve", 34'
>>> print csv_parse(s)
[4, 'Austin, "Stone Cold" Steve', 34]
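For readers on a current interpreter, here is a rough sketch of the same regular-expression approach, assuming Python 3. The names FIELD and convert are made up for this illustration, and the fallback that keeps non-numeric unquoted fields as strings is an added assumption (the original calls int() on every unquoted field, so the header line would raise ValueError). Like the original, it does not handle empty fields.

```python
import re

# The same two alternatives as the one-line pattern, spelled out with re.VERBOSE.
FIELD = re.compile(r'''
    \s*([^"]+?)\s*,        # group 1: unquoted field, surrounding blanks dropped
  | \s*"(.*?[^\\])"\s*,    # group 2: quoted field, closed by an unescaped quote
''', re.VERBOSE)

def dec(s):
    # Drop the backslash in front of any escaped character.
    return re.sub(r"\\(.)", r"\1", s)

def convert(unquoted, quoted):
    # Hypothetical helper: decide what value a matched field becomes.
    if not unquoted:
        return dec(quoted)
    try:
        return int(unquoted)
    except ValueError:
        # Assumption: keep non-numeric unquoted text (e.g. column names) as-is.
        return unquoted

def csv_parse(s):
    # A trailing comma is appended so the last field ends like all the others.
    return [convert(u, q) for u, q in FIELD.findall(s + ",")]

if __name__ == "__main__":
    print(csv_parse("ID, NAME, AGE"))
    print(csv_parse(r'4, "Austin, \"Stone Cold\" Steve", 34'))
```

With the fallback in place, the header line comes back as a list of column-name strings, while the data lines give the same values as the session above.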
// Niklas