regular expression for space separated quoted string

Padraig Brady Padraig at Linux.ie
Wed Sep 11 08:25:17 EDT 2002


Eric Brunel wrote:
> Padraig Brady wrote:
> 
> 
>>Hi, I'm trying to split a string that is separated
>>by spaces and also contains double-quoted words, which
>>can themselves contain spaces.
>>
>>E.g. using:     re.split('[ ]*"?([^"]*)"?[ ]*', s1)
>>on this string: s1='1 "2" "thre e"'
>>gives:          ['', '1 ', '', '2', '', 'thre e', '']
>>
>>The problem with this is the '' entries, but this isn't too bad.
>>
>>However, using the above re with: s2='1 2 "th ree"'
>>I get:                            ['', '1 2 ', '', 'th ree', '']
>>
>>Any ideas?
> 
> 
> What about:
> 
> >>> import re
> >>> p = r'[^ \t\n\v\f"]+|"[^"]*"'
> >>> re.findall(p, '1 2 3')
> ['1', '2', '3']
> >>> re.findall(p, '1 2 "three"')
> ['1', '2', '"three"']
> >>> re.findall(p, '1 "2" "thr ee"')
> ['1', '"2"', '"thr ee"']
> >>> re.findall(p, '"yeah it seems to be working!" yeah it seems to...')
> ['"yeah it seems to be working!"', 'yeah', 'it', 'seems', 'to...']
> 
> It leaves the double-quotes around the values, but they can be removed 
> quite simply...
> 
> I also tried to use look-ahead and look-behind features in re's, but it 
> doesn't have the expected result:
> 
> 
> >>> re.findall(r'(?<=")[^"]*(?=")|[^ \t\n\v\f"]+',
> ...            '"yeah it seems to be working!" yeah it seems to...')
> ['yeah it seems to be working!', 'yeah', 'it', 'seems', 'to...']
> >>> re.findall(r'(?<=")[^"]*(?=")|[^ \t\n\v\f"]+', '1 2 3')
> ['1', '2', '3']
> >>> re.findall(r'(?<=")[^"]*(?=")|[^ \t\n\v\f"]+', '1 "2" "thr ee"')
> ['1', '2', ' ', 'thr ee']
> 
> The ' ' between "2" and "thr ee" is also matched, because it too sits
> between two double quotes...
> 
> HTH

Thanks. Yes, findall is probably more appropriate.
Also, there is a canonical Perl solution for this;
search for "split" in: http://www.perldoc.com/perl5.6/faq/perlfaq4.html
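
For the archive, here is a minimal sketch of the findall idea in which
capture groups drop the quotes, so nothing has to be stripped afterwards.
The pattern is adapted from the one quoted above. The names TOKEN and
split_quoted are made up for illustration, and escaped quotes inside a
quoted word are not handled.

```python
import re

# Adapted from the quoted pattern: group 1 captures the inside of a
# double-quoted run, group 2 captures a bare word. Names are hypothetical.
TOKEN = re.compile(r'"([^"]*)"|([^ \t\n\v\f"]+)')

def split_quoted(s):
    """Split s on whitespace, keeping double-quoted runs whole, quotes removed."""
    tokens = []
    for m in TOKEN.finditer(s):
        quoted, bare = m.groups()
        # Exactly one group takes part in each match; the other is None.
        # An empty quoted string "" gives quoted == '' and is kept.
        tokens.append(quoted if quoted is not None else bare)
    return tokens

print(split_quoted('1 "2" "thre e"'))
print(split_quoted('1 2 "th ree"'))
```

With the two strings from the original question this should print
['1', '2', 'thre e'] and ['1', '2', 'th ree']. Unlike the look-behind
version, the opening and closing quotes are consumed by the match, so the
space between two quoted words cannot be picked up as a token.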

Pádraig.



