Performance on local constants?
Duncan Booth
duncan.booth at invalid.invalid
Sat Dec 22 07:05:00 EST 2007
William McBrine <wmcbrine at users.sf.net> wrote:
> Hi all,
>
> I'm pretty new to Python (a little over a month). I was wondering -- is
> something like this:
>
> s = re.compile('whatever')
>
> def t(whatnot):
>     return s.search(whatnot)
>
> for i in xrange(1000):
>     print t(something[i])
>
> significantly faster than something like this:
>
> def t(whatnot):
>     s = re.compile('whatever')
>     return s.search(whatnot)
>
> for i in xrange(1000):
>     result = t(something[i])
>
> ? Or is Python clever enough to see that the value of s will be the same
> on every call, and thus only compile it once?
>
The best way to answer these questions is always to try it out for
yourself. Have a look at 'timeit.py' in the library: you can run
it as a script to time simple things or import it from longer scripts.

C:\Python25>python lib/timeit.py -s "import re;s=re.compile('whatnot')" "s.search('some long string containing a whatnot')"
1000000 loops, best of 3: 1.05 usec per loop

C:\Python25>python lib/timeit.py -s "import re" "re.compile('whatnot').search('some long string containing a whatnot')"
100000 loops, best of 3: 3.76 usec per loop

C:\Python25>python lib/timeit.py -s "import re" "re.search('whatnot', 'some long string containing a whatnot')"
100000 loops, best of 3: 3.98 usec per loop

So it looks like you pay roughly 3 microseconds of overhead per call
if you don't pre-compile the regular expression. That could be
significant if you have simple matches as above, or irrelevant if the
match is complex and slow.
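
If you would rather import timeit from a longer script than run it from the command line, the same comparison might look something like the sketch below. It is written for a current Python 3 rather than the 2.5 used above, and the helper names (search_precompiled, search_compile_each_call, per_call_usec) are made up for illustration; the absolute numbers will differ from machine to machine.

```python
import re
import timeit

TEXT = 'some long string containing a whatnot'
PATTERN = 'whatnot'

# Compiled once, at import time, like the module-level "s" in the question.
precompiled = re.compile(PATTERN)


def search_precompiled():
    """Search using the pattern object compiled once above."""
    return precompiled.search(TEXT)


def search_compile_each_call():
    """Call re.compile() on every invocation, as in the second variant."""
    return re.compile(PATTERN).search(TEXT)


def per_call_usec(func, number=20000, repeat=3):
    """Best-of-`repeat` time per call in microseconds (hypothetical helper)."""
    best = min(timeit.repeat(func, number=number, repeat=repeat))
    return best / number * 1e6


for func in (search_precompiled, search_compile_each_call):
    print('%-26s %.2f usec per loop' % (func.__name__, per_call_usec(func)))
```

Taking the minimum of several repeats mirrors the "best of 3" figure that timeit prints when run as a script.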
You can also try measuring the compile time separately:

C:\Python25>python lib/timeit.py -s "import re" "re.compile('whatnot')"
100000 loops, best of 3: 2.36 usec per loop

C:\Python25>python lib/timeit.py -s "import re" "re.compile('<(?:p|div)[^>]*>(?P<pat0>(?:(?P<atag0>\\<a[^>]*\\>)\\<img[^>]+class\\s*=[^=>]*captioned[^>]+\\>\\</a\\>)|\\<img[^>]+class\\s*=[^=>]*captioned[^>]+\\>)</(?:p|div)>|(?P<pat1>(?:(?P<atag1>\\<a[^>]*\\>)\\<img[^>]+class\\s*=[^=>]*captioned[^>]+\\>\\</a\\>)|\\<img[^>]+class\\s*=[^=>]*captioned[^>]+\\>)')"
100000 loops, best of 3: 2.34 usec per loop

It makes no difference whether you use a trivial regular expression
or a complex one: Python remembers (if I remember correctly) the last
100 expressions it compiled, so after the first call these timings
measure a cache lookup rather than a real compilation, and the
overhead will be pretty constant.
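
To see what the cache is hiding, you can empty it with re.purge() before each compile. The following is only a rough sketch for a current Python 3: the pattern is a shortened, hypothetical stand-in for the long one above, the helper names are invented, and the "uncached" figure also includes the small cost of the purge call itself.

```python
import re
import timeit

# Hypothetical, shortened stand-in for the long pattern timed above.
PATTERN = r'<(?:p|div)[^>]*>(?P<body>.*?)</(?:p|div)>'


def compile_cached():
    """After the first call this is served from the re module's cache."""
    return re.compile(PATTERN)


def compile_uncached():
    """Empty the cache first, so the pattern really is compiled again."""
    re.purge()
    return re.compile(PATTERN)


def best_usec(func, number=2000, repeat=3):
    """Best-of-`repeat` time per call in microseconds (hypothetical helper)."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number * 1e6


print('cached:   %.2f usec per loop' % best_usec(compile_cached))
print('uncached: %.2f usec per loop' % best_usec(compile_uncached))
```

The uncached figure is the one that grows with the complexity of the expression; the cached one should stay roughly flat.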