Precompiled regular expressions slower?

Andy Gimblett gimbo at ftech.net
Tue Feb 26 18:07:51 CET 2002


On Tue, Feb 26, 2002 at 11:47:27AM -0500, Peter Bienstman wrote:
> Sticking with readlines rather than xreadlines for the moment to compare
> apples to apples, these are the timings:
> 
> precompiled as global var: 4.7 sec
> precompiled as local var : 4.5 sec
> not precompiled          : 3.9 sec
> 
> It helps somewhat, but doesn't solve the problem.

Do you get similar results when you ramp up the number of searches
being executed?

Disclaimer: I don't know if what I'm about to suggest is true, I
haven't looked at the code, tried this out or measured anything, I'm
guessing, I should be ignored, etc.  ;-)

Possible mechanism for what's happening:

    - Setting up a compiled re object takes time X

    - Executing one search using compiled re object takes time x

    - Thus, n searches using the compiled object take X + (n * x)

    - Executing search using re.search() takes time y

    - "Setup" is included in that time (basically consists of
      following the reference to re.search())

    - Thus, n searches using re.search() take n * y

I'm thinking maybe n is small enough and X is large enough that even
though x _is_ smaller than y, X + (n * x) is still larger than n * y.
Rearranging, compiling would only pay off once n > X / (y - x).
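
To put numbers on the model above, here is a minimal sketch of the
arithmetic.  The function names are made up for illustration, and the
values of X, x and y in the example are invented, not measurements.

```python
import math

def compiled_cost(n, X, x):
    """Total time for n searches with a precompiled object: X + (n * x)."""
    return X + n * x

def uncompiled_cost(n, y):
    """Total time for n searches through re.search(): n * y."""
    return n * y

def break_even(X, x, y):
    """Smallest n at which precompiling is strictly cheaper.

    Solves X + (n * x) < n * y for n, i.e. n > X / (y - x).
    Returns None when y <= x, because then precompiling never wins.
    """
    if y <= x:
        return None
    return math.floor(X / (y - x)) + 1

if __name__ == "__main__":
    # Invented numbers: setup costs 100 units, each compiled search 1,
    # each re.search() call 3.
    n = break_even(100, 1, 3)
    print(n, compiled_cost(n, 100, 1), uncompiled_cost(n, 3))
```

If the guess is right, any run with fewer searches than break_even()
returns would show the precompiled version losing.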

The question is, how many searches do you need to perform to make
compiled regular expressions worthwhile?
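
One way to find out is to time both approaches while ramping up n.
The following is a rough sketch only: PATTERN and LINE are placeholders
I made up rather than anything from the original program, and it is
written for a current Python.  The re module caches compiled patterns
internally, so each run starts with re.purge() to clear that cache and
make the setup cost show up in both timings.

```python
import re
import timeit

PATTERN = r"\d+"         # placeholder pattern, not from the original code
LINE = "abc 12345 def"   # placeholder input line

def time_precompiled(n):
    """Time one compile followed by n searches on the compiled object."""
    def run():
        re.purge()                 # start with an empty pattern cache
        rx = re.compile(PATTERN)   # setup cost X, paid once
        for _ in range(n):
            rx.search(LINE)        # per-search cost x
    return timeit.timeit(run, number=1)

def time_module_level(n):
    """Time n calls to re.search() with the pattern passed as a string."""
    def run():
        re.purge()                 # start with an empty pattern cache
        for _ in range(n):
            re.search(PATTERN, LINE)   # per-call cost y
    return timeit.timeit(run, number=1)

if __name__ == "__main__":
    for n in (1, 10, 100, 1000, 10000):
        print(n, time_precompiled(n), time_module_level(n))
```

Single runs like this are noisy, so repeating each measurement a few
times and taking the minimum would give steadier numbers.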

Just a thought...  As I said, I could be completely barking up the
wrong tree.  If you _do_ get similar results for much bigger values of
n, I'd guess I'm wrong.  But if that's the case then yeah, what the
heck are precompiled regular expressions for? :-)

-Andy

-- 
Andy Gimblett - Programmer - Frontier Internet Services Limited
Tel: 029 20 820 044 Fax: 029 20 820 035 http://www.frontier.net.uk/
Statements made are at all times subject to Frontier's Terms and
Conditions of Business, which are available upon request.



