Escaping commas within parens in CSV parsing?
skip at pobox.com
Fri Jul 1 04:59:10 CEST 2005
Ramon> I am trying to use the csv module to parse a column of values
Ramon> containing comma-delimited values with unusual escaping:
Ramon> AAA, BBB, CCC (some text, right here), DDD
Ramon> I want this to come back as:
Ramon> ["AAA", "BBB", "CCC (some text, right here)", "DDD"]
Alas, there's no "escaping" at all in the line above. I see no obvious way
to distinguish one comma from another in this example. If you mean the fact
that the comma you want to retain is in parens, that's not escaping. Escape
characters don't appear in the output as they do in your example.
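To make the point concrete, here is a small sketch (written for Python 3, with a hypothetical helper named parse) of what the csv module does with the line as given, and with a variant where the parenthesized field is actually quoted. The quoted variant is an assumption about what the data would have to look like; it is not something the original file contains.

```python
import csv

plain = 'AAA, BBB, CCC (some text, right here), DDD'
# Hypothetical variant of the same line with real CSV quoting.
quoted = 'AAA, BBB, "CCC (some text, right here)", DDD'

def parse(line):
    # skipinitialspace drops the blank that follows each delimiter,
    # which also lets the quote character be seen at the start of a field.
    return next(csv.reader([line], skipinitialspace=True))

print(parse(plain))   # parens mean nothing to csv, so every comma splits
print(parse(quoted))  # the quoted field keeps its comma
```

The parens are just ordinary characters to the parser. Only the quote character (or a configured escapechar) protects a delimiter.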
Ramon> I can probably hack this with regular expressions but I thought
Ramon> I'd check to see if anyone had any quick suggestions for how to
Ramon> do this elegantly first.
I see nothing obvious unless you truly mean that the beginning of each field
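is all caps. In that case you could wrap a file object and quote each field
before the csv module sees the line:

    import csv, re

    class FunnyWrapper:
        def __init__(self, f):
            self.f = f
        def __iter__(self): return self
        def next(self):  # Python 2 iterator protocol
            return '"' + re.sub(r',( *[A-Z]+)', r'","\1', self.f.next()) + '"'

and use it like so:

    reader = csv.reader(FunnyWrapper(open("somefile.csv", "rb")))
    for row in reader:
        print row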
(I'm not sure what the ramifications are of iterating over a file opened in
binary mode.)
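For anyone trying the same idea on Python 3, a rough equivalent can be written as a generator, since csv.reader accepts any iterable of strings. This is only a sketch under the same all-caps assumption. The name quote_caps_fields is made up for illustration, and two details differ from the wrapper above on purpose: the line ending is stripped before the closing quote is added, and the blanks after each comma are left outside the captured group so the fields come back without leading spaces.

```python
import csv
import re

def quote_caps_fields(lines):
    """Yield each line with every all-caps field wrapped in double quotes."""
    for line in lines:
        body = line.rstrip("\r\n")  # keep the newline out of the last field
        # A comma followed by optional blanks and capital letters is treated
        # as a real delimiter; any other comma stays inside its field.
        yield '"' + re.sub(r', *([A-Z]+)', r'","\1', body) + '"'

sample = ["AAA, BBB, CCC (some text, right here), DDD\n"]
rows = list(csv.reader(quote_caps_fields(sample)))
print(rows)
```

With a real file you would presumably pass open("somefile.csv", newline="") instead of the sample list. The trick breaks if a field contains a double quote, or if a comma inside the parens happens to be followed by a capital letter.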