How to split a string containing nested comma-separated substrings
omer at no-log.org
Wed Jun 18 20:12:33 CEST 2008
On Wednesday 18 June 2008 19:19:57, Robert Dodier wrote:
> I'd like to split a string by commas, but only at the "top level" so
> to speak. An element can be a comma-less substring, or a
> quoted string, or a substring which looks like a function call.
> If some element contains commas, I don't want to split it.
> 'foo, bar, baz' => 'foo' 'bar' 'baz'
> 'foo, "bar, baz", blurf' => 'foo' 'bar, baz' 'blurf'
> 'foo, bar(baz, blurf), mumble' => 'foo' 'bar(baz, blurf)' 'mumble'
> Can someone suggest a suitable regular expression or other
> method to split such strings?
I'd do something like this (note that it doesn't check for mismatched quotes
or parentheses, and it removes _all_ the quotes):
def mysplit(string):
    pardepth = 0    # current parenthesis nesting depth
    quote = False   # True while inside a quoted substring
    ret = ['']

    for car in string:
        if car == '(':
            pardepth += 1
        elif car == ')':
            pardepth -= 1
        elif car in ('"', "'"):
            quote = not quote
            car = ''  # just if you don't want to keep the quotes

        # Commas and spaces separate elements, but only at the top level.
        # A tuple is tested here because '' in ', ' is always True.
        if car in (',', ' ') and not (pardepth or quote):
            if ret[-1] != '':
                ret.append('')
        else:
            ret[-1] += car

    return ret

for s in ('foo, bar, baz',
          'foo, "bar, baz", blurf',
          'foo, bar(baz, blurf), mumble'):
    print("'%s' => '%s'" % (s, mysplit(s)))
'foo, bar, baz' => '['foo', 'bar', 'baz']'
'foo, "bar, baz", blurf' => '['foo', 'bar, baz', 'blurf']'
'foo, bar(baz, blurf), mumble' => '['foo', 'bar(baz, blurf)', 'mumble']'
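
The two limitations mentioned above (no mismatch check, quotes always
removed) could be addressed along the following lines. This is only a sketch
under a few assumptions, not part of the original reply: the name
split_top_level and the keep_quotes parameter are hypothetical, only commas
(not spaces) separate elements, a quoted substring ends at the same quote
character that opened it, and surrounding whitespace is stripped from each
element afterwards.

```python
def split_top_level(text, keep_quotes=False):
    """Split text on top-level commas; raise ValueError on mismatches.

    Hypothetical helper: the name and the keep_quotes flag are
    illustrative assumptions, not taken from the original post.
    """
    parts = ['']
    depth = 0      # parenthesis nesting depth
    quote = None   # quote character that opened the current string, if any

    for ch in text:
        if quote:
            # Inside a quoted substring: only the matching quote ends it.
            if ch == quote:
                quote = None
                if not keep_quotes:
                    continue
            parts[-1] += ch
        elif ch in ('"', "'"):
            quote = ch
            if keep_quotes:
                parts[-1] += ch
        elif ch == ',' and depth == 0:
            # A comma outside quotes and parentheses starts a new element.
            parts.append('')
        else:
            if ch == '(':
                depth += 1
            elif ch == ')':
                depth -= 1
                if depth < 0:
                    raise ValueError("unbalanced ')'")
            parts[-1] += ch

    if quote:
        raise ValueError("unterminated quote")
    if depth:
        raise ValueError("unbalanced '('")
    return [p.strip() for p in parts]


for s in ('foo, bar, baz',
          'foo, "bar, baz", blurf',
          'foo, bar(baz, blurf), mumble'):
    print("'%s' => %r" % (s, split_top_level(s)))
```

Tracking which quote character opened the substring means an apostrophe
inside a double-quoted element no longer toggles the quote state. Note that
the final strip() also trims whitespace that was inside a quoted element once
its quotes have been dropped; pass keep_quotes=True if that matters.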