idiom for RE matching
Reddy
reddyist at gmail.com
Thu Jul 19 19:03:23 EDT 2007
On 7/19/07, Gordon Airporte <JHoover at fbi.gov> wrote:
>
> I have some code which relies on running each line of a file through a
> large number of regexes which may or may not apply. For each pattern I
> want to match I've been writing
>
> gotit = mypattern.findall(line)
Try the iterator function finditer instead of findall. To see the
difference, run the code below, calling either findIter or findAll (one at a
time) in the for loop. You can achieve at least 4x better performance.
-----------------------------------------------------------------------------------
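import re
import time

# Python 2 syntax (xrange, print statement).
m = re.compile(r'(\d+/\d+/\d+)')
line = "Today's date is 21/07/2007 then yesterday's 20/07/2007"

def findIter(line):
    # finditer returns an iterator of match objects
    g = m.finditer(line)
    glist = [x.group(0) for x in g]
    return glist

def findAll(line):
    # findall builds the whole list of matched strings at once
    glist = m.findall(line)
    return glist

start = time.time()
for i in xrange(1000000):
    # Time one function per run; swap the comment to time the other.
    #findIter(line)
    findAll(line)
end = time.time()
print end-start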
--------------------------------------------------------------------------------------------------------
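For anyone who wants to time both calls in a single run, here is a minimal sketch of the same comparison using the standard timeit module. It assumes Python 3 (the snippet above is Python 2), and the names find_iter, find_all and time_both are made up for this illustration. Timings depend on the Python version and on the input, so it is worth measuring on your own data rather than relying on any fixed ratio.

```python
import re
import timeit

# Same pattern and sample line as in the snippet above.
pattern = re.compile(r'(\d+/\d+/\d+)')
line = "Today's date is 21/07/2007 then yesterday's 20/07/2007"

def find_iter(text):
    # finditer yields match objects lazily; group(0) is the whole match.
    return [m.group(0) for m in pattern.finditer(text)]

def find_all(text):
    # With exactly one capturing group, findall returns that group's strings.
    return pattern.findall(text)

def time_both(number=100000):
    # Hypothetical helper: time each function over the same number of calls.
    t_iter = timeit.timeit(lambda: find_iter(line), number=number)
    t_all = timeit.timeit(lambda: find_all(line), number=number)
    return t_iter, t_all

if __name__ == "__main__":
    t_iter, t_all = time_both()
    print("finditer: %.3fs  findall: %.3fs" % (t_iter, t_all))
```

Both functions return the same list of strings for this pattern, because its single capturing group spans the whole match. The only difference being measured is how the list gets built.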