regular expression
Bengt Richter
bokr at oz.net
Sat Mar 26 02:41:54 EST 2005
On Fri, 25 Mar 2005 23:54:32 -0500, Peter Hansen <peter at engcorp.com> wrote:
>Bengt Richter wrote:
>> On Sat, 26 Mar 2005 02:07:15 GMT, aaron <asteele at berkeley.edu> wrote:
>>>>>>pattern.sub(':', '375 mi. south of U.C.B is 3.4 degrees warmer.')
>>>'375 mi: south of U:C:B is 3.4 degrees warmer:'
>>>
>>>so this works, but not in the following case:
>>>>>>pattern.sub(':', '.3')
>>>
>> Brute force the exceptional case that happens at the start of the line?
>>
>> >>> import re
>> >>> pattern = re.compile(r'^[.]|(?!\d)[.](?!\d)')
>> >>> pattern.sub(':', '375 mi. south of U.C.B is 3.4 degrees warmer.')
>> '375 mi: south of U:C:B is 3.4 degrees warmer:'
>> >>> pattern.sub(':', '.3')
>> ':3'
>> >>> pattern.sub(':', '3.')
>> '3:'
>
>Be careful... the OP has assumed something that isn't true,
>and Bengt's fix isn't sufficient:
>
> >>> import re
> >>> s = 'x.3'
> >>> pattern = re.compile(r'^[.]|(?!\d)[.](?!\d)')
> >>> pattern.sub(':', '.3')
>':3'
> >>> pattern.sub(':', s)
>'x.3'
>
>So the OP's "this works" comment was wrong.
>
>Suggestion: whip up a variety of automated test cases and
>make sure you run them all whenever you make changes to
>this code...
>
>(No, I don't have a solution to the continuing problem,
>other than to wonder whether the input data really requires
>all these edge cases to be handled properly.)
>
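For what it's worth, the reason 'x.3' slips through seems to be that the leading (?!\d) is a lookahead, not a lookbehind. It is tested at the position of the dot itself, where the next character is '.', so it always succeeds, and the second alternative reduces to [.](?!\d). A small sketch in present-day Python 3 syntax (the names old and reduced are only for illustration, and the sample strings are my own picks):

```python
import re

# The pattern from the quoted session above.
old = re.compile(r'^[.]|(?!\d)[.](?!\d)')
# Hypothetical equivalent with the vacuous leading lookahead dropped.
reduced = re.compile(r'^[.]|[.](?!\d)')

samples = ['.3', 'x.3', '3.', '3.x', '3.3', 'U.C.B', '375 mi. south']
for s in samples:
    # Both patterns should give identical results on every sample.
    assert old.sub(':', s) == reduced.sub(':', s)
    print('%r => %r' % (s, old.sub(':', s)))
```

If that reading is right, what was wanted before the dot is a lookbehind, (?<!\d), which is what the pattern below uses.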
Goes to show you ;-/ Do we need more tests than these?
>>> import re
>>> pattern = re.compile(r'[.](?!\d)|(?<!\d)[.]')
>>> print pattern.sub(':', '375 mi. south of U.C.B is 3.4 degrees warmer.')
375 mi: south of U:C:B is 3.4 degrees warmer:
>>> for s,ss in ((s,pattern.sub(':', s)) for s in ('%s%s.%s%s'%(sp1,c1,c2,sp2)
...         for sp1 in ('', ' ')
...         for c1 in ('', 'x', '3')
...         for c2 in ('', 'x', '3')
...         for sp2 in ('', ' '))):
...     print '%10r => %r' %(s,ss)
...
       '.' => ':'
      '. ' => ': '
      '.x' => ':x'
     '.x ' => ':x '
      '.3' => ':3'
     '.3 ' => ':3 '
      'x.' => 'x:'
     'x. ' => 'x: '
     'x.x' => 'x:x'
    'x.x ' => 'x:x '
     'x.3' => 'x:3'
    'x.3 ' => 'x:3 '
      '3.' => '3:'
     '3. ' => '3: '
     '3.x' => '3:x'
    '3.x ' => '3:x '
     '3.3' => '3.3'
    '3.3 ' => '3.3 '
      ' .' => ' :'
     ' . ' => ' : '
     ' .x' => ' :x'
    ' .x ' => ' :x '
     ' .3' => ' :3'
    ' .3 ' => ' :3 '
     ' x.' => ' x:'
    ' x. ' => ' x: '
    ' x.x' => ' x:x'
   ' x.x ' => ' x:x '
    ' x.3' => ' x:3'
   ' x.3 ' => ' x:3 '
     ' 3.' => ' 3:'
    ' 3. ' => ' 3: '
    ' 3.x' => ' 3:x'
   ' 3.x ' => ' 3:x '
    ' 3.3' => ' 3.3'
   ' 3.3 ' => ' 3.3 '
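
Peter's suggestion of automated tests could be sketched roughly as follows, in Python 3 syntax rather than the 2.x used in the session above. The helpers expected() and check_all() are hypothetical, and expected() encodes my reading of the intended rule: a dot is left alone only when it has a digit on both sides. The character set and width are arbitrary choices, not something from the original problem.

```python
import itertools
import re

pattern = re.compile(r'[.](?!\d)|(?<!\d)[.]')

def expected(s):
    """Regex-free reference (assumed rule): keep a '.' only if
    both of its neighbours are digits, otherwise turn it into ':'."""
    out = []
    for i, ch in enumerate(s):
        if ch == '.':
            before = s[i - 1] if i > 0 else ''
            after = s[i + 1] if i + 1 < len(s) else ''
            if not (before.isdigit() and after.isdigit()):
                ch = ':'
        out.append(ch)
    return ''.join(out)

def check_all(chars=('', ' ', 'x', '3', '.'), width=2):
    """Compare pattern.sub against expected() for every combination
    of `width` picks from chars on each side of a central dot."""
    count = 0
    for left in itertools.product(chars, repeat=width):
        for right in itertools.product(chars, repeat=width):
            s = ''.join(left) + '.' + ''.join(right)
            # On failure the offending string is shown as the message.
            assert pattern.sub(':', s) == expected(s), s
            count += 1
    return count

print(check_all(), 'cases passed')
```

Including '.' in the character set also exercises runs of adjacent dots such as '3..3', which the table above does not cover. Whether the real input needs that is a separate question.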
Regards,
Bengt Richter