regular expression for space separated quoted string

Eric Brunel eric.brunel at pragmadev.com
Wed Sep 11 14:17:03 CEST 2002


Padraig Brady wrote:

> Hi, I'm trying to split a string that is separated
> by spaces and also contains double quoted words which
> can contain spaces:
> 
> For e.g. using: re.split('[ ]*"?([^"]*)"?[ ]*', s1)
> on this string: s1='1 "2" "thre e"'
> gives:          ['', '1 ', '', '2', '', 'thre e', '']
> 
> Problem with this is the '' entries, but this isn't too bad.
> 
> However using the above re with: s2='1 2 "th ree"'
> I get:                           ['', '1 2 ', '', 'th ree', '']
> 
> any ideas?

What about:
>>> import re
>>> p = r'[^ \t\n\v\f"]+|"[^"]*"'
>>> re.findall(p, '1 2 3')
['1', '2', '3']
>>> re.findall(p, '1 2 "three"')
['1', '2', '"three"']
>>> re.findall(p, '1 "2" "thr ee"')
['1', '"2"', '"thr ee"']
>>> re.findall(p, '"yeah it seems to be working!" yeah it seems to...')
['"yeah it seems to be working!"', 'yeah', 'it', 'seems', 'to...']

It leaves the double-quotes around the values, but they can be removed 
quite simply...
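
For instance, the quotes could be stripped in a second pass over the tokens. 
A minimal sketch, relying on the fact that with the pattern above only a 
quoted token can start with a double quote; the names TOKEN and split_quoted 
are made up for this illustration:

```python
import re

# Same pattern as above: a run of non-blank, non-quote characters,
# or a double-quoted string (which may contain blanks).
TOKEN = r'[^ \t\n\v\f"]+|"[^"]*"'

def split_quoted(s):
    """Split s on blanks, keeping double-quoted parts together (hypothetical helper)."""
    tokens = re.findall(TOKEN, s)
    # A token starting with '"' was matched by the quoted alternative,
    # so it also ends with '"': drop both delimiters.
    return [t[1:-1] if t.startswith('"') else t for t in tokens]

print(split_quoted('1 "2" "thr ee"'))
print(split_quoted('1 2 "th ree"'))
```

This should give ['1', '2', 'thr ee'] and ['1', '2', 'th ree'] for the two 
strings from the original question.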

I also tried to use the look-ahead and look-behind features of regular 
expressions, but this doesn't give the expected result:

>>> re.findall(r'(?<=")[^"]*(?=")|[^ \t\n\v\f"]+',
...            '"yeah it seems to be working!" yeah it seems to...')
['yeah it seems to be working!', 'yeah', 'it', 'seems', 'to...']
>>> re.findall(r'(?<=")[^"]*(?=")|[^ \t\n\v\f"]+', '1 2 3')
['1', '2', '3']
>>> re.findall(r'(?<=")[^"]*(?=")|[^ \t\n\v\f"]+', '1 "2" "thr ee"')
['1', '2', ' ', 'thr ee']

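The " " between "2" and "thr ee" is also matched: look-behind and look-ahead 
do not consume the quotes, so the closing quote of "2" and the opening quote 
of "thr ee" are taken as the delimiters of another quoted string...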
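
One possible way around this is to let the pattern consume the quotes and 
capture only what lies between them, so a quote can never be reused by the 
next match. This is only a sketch; QUOTED_OR_BARE and split_with_groups are 
hypothetical names, not anything from the re module:

```python
import re

# Group 1: contents of a double-quoted string (the quotes are consumed
# but not captured). Group 2: a bare word without blanks or quotes.
QUOTED_OR_BARE = re.compile(r'"([^"]*)"|([^ \t\n\v\f"]+)')

def split_with_groups(s):
    """Return the tokens of s with the quotes already removed (hypothetical helper)."""
    result = []
    for m in QUOTED_OR_BARE.finditer(s):
        quoted, bare = m.group(1), m.group(2)
        # Exactly one group takes part in each match; the other is None.
        # Comparing with None keeps an empty quoted string "" as ''.
        result.append(quoted if quoted is not None else bare)
    return result

print(split_with_groups('1 "2" "thr ee"'))
```

With the string that went wrong above, this should print ['1', '2', 'thr ee'], 
without the stray ' ' entry.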

HTH
-- 
- Eric Brunel <eric.brunel at pragmadev.com> -
PragmaDev : Real Time Software Development Tools - http://www.pragmadev.com


