[Python-ideas] re.compile_lazy - on first use compiled regexes

Antoine Pitrou solipsis at pitrou.net
Thu Mar 28 14:20:39 CET 2013


On Thu, 28 Mar 2013 12:25:17 +1100
Steven D'Aprano <steve at pearwood.info> wrote:
> On 28/03/13 10:06, Terry Reedy wrote:
> > On 3/24/2013 3:38 AM, Stefan Behnel wrote:
> >> Gregory P. Smith, 24.03.2013 00:48:
> > >>> In the absence of profiling numbers showing otherwise, I'd rather see all
> >>> forms of manual caching like the conditional checks or a keep=True go away
> >>> as it's dirty and encourages premature "optimization".
> >>
> >> +1
> >>
> >> If I had been "more aware" of the re internal cache during the last years,
> >> I would have avoided at least a couple of re.compile() calls in my code, I
> >> guess.
> >>
> >> Maybe this is something that the documentation of re.compile() can help
> >> with, by telling people explicitly that this apparently cool feature of
> >> pre-compiling actually has a drawback in it (startup time + a bit of memory
> >> usage) and that they won't notice a runtime difference in most cases anyway.
> >
> > With a decent re cache size, .compile seems more like an attractive nuisance than something useful.
> 
> 
> On the contrary, I think that it is the cache which is an (unattractive) nuisance.
> 
> Like any cache, performance is only indirectly under your control. You cannot know for sure whether re.match(some_pattern, text) will be a cheap cache hit or an expensive re-compilation.

CPython is full of caches so, if that's what you worry about, your
problem is bigger than simply regex patterns.

(your CPU is full of caches too)
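
For anyone following along, here is a minimal sketch of the cache behaviour
being argued about. It assumes CPython's re module, where the internal cache
is an implementation detail rather than a documented guarantee; PATTERN is a
made-up pattern used only for illustration.

```python
import re

PATTERN = r"\d+-\d+"  # hypothetical pattern, for illustration only

first = re.compile(PATTERN)
second = re.compile(PATTERN)   # CPython serves this from re's internal cache
cached_hit = first is second   # same object means no second compilation

re.purge()                     # documented call: clears the regex cache
third = re.compile(PATTERN)    # cache is empty, so the pattern is compiled again
recompiled = third is not first

# A pattern object you hold explicitly keeps working whatever the cache state.
match = first.match("12-34")
```

Holding on to the compiled object is what takes the cache out of the
picture; calling re.match(PATTERN, text) directly leaves the hit-or-miss
question to the module.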

Regards

Antoine.




