How to split a string containing nested comma-separated substrings
Cédric Lucantis
omer at no-log.org
Wed Jun 18 14:12:33 EDT 2008
Hi,
On Wednesday 18 June 2008 19:19:57, Robert Dodier wrote:
> Hello,
>
> I'd like to split a string by commas, but only at the "top level" so
> to speak. An element can be a comma-less substring, or a
> quoted string, or a substring which looks like a function call.
> If some element contains commas, I don't want to split it.
>
> Examples:
>
> 'foo, bar, baz' => 'foo' 'bar' 'baz'
> 'foo, "bar, baz", blurf' => 'foo' 'bar, baz' 'blurf'
> 'foo, bar(baz, blurf), mumble' => 'foo' 'bar(baz, blurf)' 'mumble'
>
> Can someone suggest a suitable regular expression or other
> method to split such strings?
>
I'd do something like this (note that it doesn't check for mismatched quotes
or parentheses, it removes _all_ the quotes, and it also splits on top-level
spaces):
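def mysplit(string):
    pardepth = 0    # current parenthesis nesting depth
    quote = False   # True while inside a quoted string
    ret = ['']
    for car in string:
        if car == '(':
            pardepth += 1
        elif car == ')':
            pardepth -= 1
        elif car in ('"', "'"):
            quote = not quote
            car = ''  # just if you don't want to keep the quotes
        # compare against a tuple: '' in ', ' is True, which would make a
        # closing quote act as a separator
        if car in (',', ' ') and not (pardepth or quote):
            # top-level separator: start a new element unless one is pending
            if ret[-1] != '':
                ret.append('')
        else:
            ret[-1] += car
    return ret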
# test
for s in ('foo, bar, baz',
          'foo, "bar, baz", blurf',
          'foo, bar(baz, blurf), mumble'):
    print("'%s' => '%s'" % (s, mysplit(s)))
# result
'foo, bar, baz' => '['foo', 'bar', 'baz']'
'foo, "bar, baz", blurf' => '['foo', 'bar, baz', 'blurf']'
'foo, bar(baz, blurf), mumble' => '['foo', 'bar(baz, blurf)', 'mumble']'
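If you do need the checks that the function above skips, a stricter variant could look roughly like the sketch below. This is only one possible approach, not a drop-in replacement: the name split_top_level and the keep_quotes parameter are made up for illustration, it splits on commas only (not on spaces), it strips whitespace around each element, and it raises ValueError on an unterminated quote or an unbalanced parenthesis.

```python
def split_top_level(text, keep_quotes=False):
    """Split text on top-level commas; raise ValueError on unbalanced input.

    Hypothetical helper: the name and the keep_quotes flag are illustrative.
    """
    depth = 0        # current parenthesis nesting depth
    quote = None     # the quote character we are inside, or None
    parts = ['']
    for ch in text:
        if quote:
            # Inside a quoted string: only the matching quote character ends it.
            if ch == quote:
                quote = None
                if not keep_quotes:
                    continue
            parts[-1] += ch
        elif ch in ('"', "'"):
            quote = ch
            if keep_quotes:
                parts[-1] += ch
        elif ch == ',' and depth == 0:
            # Top-level comma: start a new element.
            parts.append('')
        else:
            if ch == '(':
                depth += 1
            elif ch == ')':
                depth -= 1
                if depth < 0:
                    raise ValueError("unmatched ')' in %r" % text)
            parts[-1] += ch
    if quote:
        raise ValueError("unterminated quote in %r" % text)
    if depth:
        raise ValueError("unmatched '(' in %r" % text)
    # Drop the whitespace that surrounds each element.
    return [p.strip() for p in parts]


for s in ('foo, bar, baz',
          'foo, "bar, baz", blurf',
          'foo, bar(baz, blurf), mumble'):
    print("%r => %r" % (s, split_top_level(s)))
```

Tracking which quote character opened the string (rather than a plain boolean) means an apostrophe inside a double-quoted element does not end it. Unlike the function above, empty elements are kept, so 'a,,b' yields three items.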
--
Cédric Lucantis