Compiling regex inside function?
Diez B. Roggisch
deets at nospam.web.de
Mon Aug 3 18:01:14 CEST 2009
Anthra Norell wrote:
> Hi all,
> I have a regex that has no use outside of a particular function. From
> an encapsulation point of view it should be scoped as restrictively as
> possible. Defining it inside the function certainly works, but if
> re.compile () is run every time the function is called, it isn't such a
> good idea after all. E.g.
> def entries (l):
>     r = re.compile ('([0-9]+) entr(y|ies)')
>     match = r.search (l)
>     if match: return match.group (1)
> So the question is: does "r" get regex-compiled once at py-compile time
> or repeatedly at entries() run time?
This can't be answered with a simple yes or no.
The statement is executed each time the function is called, but the
resulting pattern object isn't re-created: the re module keeps a cache of
compiled patterns. So unless you create a situation where that cache's
limit is exceeded and pattern objects are removed from it, the overhead is
essentially one function call and a dict lookup. Certainly worth it.
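A quick way to see the caching for yourself is sketched below. It is only an illustration: the identity check relies on CPython's re module returning the cached object, which is an implementation detail and not a documented guarantee, and the sample input strings are made up.

```python
import re

def entries(line):
    # re.compile() is called on every invocation, but the re module
    # caches compiled patterns, so repeat calls amount to a lookup.
    r = re.compile('([0-9]+) entr(y|ies)')
    match = r.search(line)
    if match:
        return match.group(1)
    return None

# Compiling the same pattern string twice; on CPython both names are
# expected to refer to the very same cached pattern object.
first = re.compile('([0-9]+) entr(y|ies)')
second = re.compile('([0-9]+) entr(y|ies)')
same_object = first is second

# Hypothetical sample input, just to exercise the function.
count = entries('found 12 entries')
```

If same_object is true, the second re.compile() call did no compilation work at all; it only fetched the existing object from the cache.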
As an additional note: the r"" prefix has *nothing* to do with this. It
just marks a so-called raw string literal, which has different escaping
behavior and thus makes regexes easier to write.
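To illustrate the raw-string point, here is a small sketch. The pattern and the sample text are invented for the example; the only claim is that the r prefix changes how the literal is written, not how it is compiled.

```python
import re

# The same pattern written twice: once with doubled backslashes,
# once as a raw string literal. Both produce identical str objects.
plain = '\\d+\\.\\d+'
raw = r'\d+\.\d+'
same_pattern = plain == raw

# In a normal literal '\n' is one newline character; in a raw
# literal it stays as two characters, a backslash and an 'n'.
newline_len = len('\n')
raw_len = len(r'\n')

# Hypothetical sample text, just to show the raw pattern in use.
match = re.search(raw, 'version 3.14')
```

Since both spellings yield the same string, the re module sees no difference between them; the raw form just saves you from doubling every backslash.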