regexp non-greedy matching bug?
Fredrik Lundh
fredrik at pythonware.com
Mon Dec 5 03:31:56 EST 2005
Aahz wrote:
> While you're technically correct, I've been bitten too many times by
> forgetting whether to use match() or search(). I've fixed that problem
> by choosing to always use search() and combine with ^ as appropriate.
that's a bit suboptimal, though, at least for cases where failed matches
are rather common:
C:\>timeit -s "import re; p = re.compile('b')" "p.match('a'*100)"
100000 loops, best of 3: 6.14 usec per loop
C:\>timeit -s "import re; p = re.compile('^b')" "p.match('a'*100)"
100000 loops, best of 3: 6.25 usec per loop
C:\>timeit -s "import re; p = re.compile('^b')" "p.search('a'*100)"
100000 loops, best of 3: 15.4 usec per loop
(afaik, search doesn't have any heuristics for figuring out if it can skip
the search, so it'll check ^ against all available positions)
on the other hand, benchmarking RE:s always gives confusing
results:
C:\>timeit -s "import re; p = re.compile('b')" "p.search('a'*100)"
100000 loops, best of 3: 4.32 usec per loop
(should this really be *faster* than match for this case ?)
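For anyone who wants to repeat the four measurements above on their own machine, here is a minimal sketch using the standard timeit module from a script instead of the command line. The names SUBJECT, CASES, same_outcome and time_call are made up for this illustration, and the lambda wrapper adds a little call overhead to every case, so only the relative ordering is meaningful. The numbers will differ by machine and Python version and may not reproduce the ordering shown above.

```python
import re
import timeit

# Same subject string as in the timings above: 100 'a' characters, no 'b'.
SUBJECT = 'a' * 100

plain = re.compile('b')
anchored = re.compile('^b')

# The four pattern/method combinations measured in the post.
CASES = [
    ("'b'  match ", plain.match),
    ("'^b' match ", anchored.match),
    ("'^b' search", anchored.search),
    ("'b'  search", plain.search),
]


def same_outcome(text):
    """Return True if 'b' with match() and '^b' with search() agree on text."""
    m1 = plain.match(text)
    m2 = anchored.search(text)
    if m1 is None or m2 is None:
        return m1 is None and m2 is None
    return m1.span() == m2.span()


def time_call(func, number=100000):
    """Best-of-3 time per call, in microseconds, for func(SUBJECT)."""
    best = min(timeit.repeat(lambda: func(SUBJECT), number=number, repeat=3))
    return best / number * 1e6


if __name__ == '__main__':
    for label, func in CASES:
        print("%s: %.2f usec per loop" % (label, time_call(func)))
```

same_outcome() is only there to check that match() and the ^ plus search() idiom give the same answer when re.MULTILINE is not set, so the comparison is purely about speed.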
</F>