Help with regular expression using findall and .*?
darrell
dgallion1 at yahoo.com
Sat Sep 14 09:28:56 EDT 2002
Here's an example with backtracking turned off:
>>> s="""a\nb\n1"""
>>> re.findall("[^\n]+?\d", s)
['a\nb\n1']
That result is not correct: [^\n] can never match a newline, so no match
should span lines.
Whatever pattern precedes the '+' must be re-evaluated as the match moves
forward. sre handles this through recursion.
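As a small illustration of why a non-greedy quantifier still needs backtracking, here is a sketch using the standard re module. The sample string 'abc1' and the variable names are made up for the example. The lazy '+?' tries the shortest match first, and each time the rest of the pattern fails the engine has to back up and let it take one more character.

```python
import re

# Lazy quantifier: tries one character first, then extends one at a time
# whenever the following \d fails to match.
pattern = re.compile(r"[^\n]+?\d")

# 'abc1': '+?' starts with 'a', \d fails on 'b', so the engine backs up
# and extends to 'ab', then 'abc', until \d finally matches '1'.
lazy_match = pattern.findall("abc1")

# With normal backtracking the match can never span a newline, because
# [^\n] excludes it; the string from the example above yields no match.
no_match = pattern.findall("a\nb\n1")

print(lazy_match)
print(no_match)
```

So even with '+?' or '*?' the engine has to remember where it was in order to try the next longer candidate.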
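import re

# Ten macro blocks, each with a 20000-character body
s2 = ('macro\n' + 'a'*20000 + '\norcam\n') * 10
# Split on either delimiter instead of matching with .*?
s2split = re.split("macro\n|\norcam\n", s2)
for r in s2split:
    print(r)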
This should also be fast.
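One thing to watch with the split approach: adjacent delimiters, and delimiters at either end of the string, leave empty strings in the result. A minimal sketch, assuming only the non-empty pieces are wanted (the name bodies is just for illustration):

```python
import re

s2 = ('macro\n' + 'a' * 20000 + '\norcam\n') * 10

# Splitting on the delimiters leaves empty strings between adjacent
# delimiters and at both ends, so drop them to keep only the bodies.
bodies = [part for part in re.split("macro\n|\norcam\n", s2) if part]

print(len(bodies))
print(len(bodies[0]))
```

This prints 10 and then 20000, one body per macro block.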
--Darrell
czrpb wrote:
> Harvey:
>
> Great thanks!! And thanks for sticking to my question's requirements.
> <wink!>
>
> Ok, this is what we thought around here. But what I do not understand is
> why any backtracking data is being kept? The '?' in '.*?' means it is
> non-greedy right? When would backtracking ever occur using '.*?'? What am
> I missing?
>