Gregory P. Smith, 24.03.2013 00:48:
On Sat, Mar 23, 2013 at 4:34 PM, Bruce Leban wrote:
On Sat, Mar 23, 2013 at 4:14 PM, Gregory P. Smith wrote:
keep=True defeats the purpose of a caching strategy. An re.compile call within some code somewhere is typically not in a position to know if it is going to be called a lot.
I think the code, as things are now, with dynamic construction at runtime based on a simple test, is the best of both worlds: it avoids the more complicated cost of calling re.compile and going through its cache logic. If the caching is ever improved to be faster, the code can arguably be simplified to use re.search or re.match directly and rely solely on the caching.
i.e.: don't change anything.
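For readers following along, the two approaches being compared might look roughly like the sketch below. The names `_WORD_RE`, `find_word_lazy` and `find_word_cached` and the pattern itself are made up for illustration; they do not come from the code under discussion.

```python
import re

# Hypothetical module-level slot; stays None until the pattern is first needed.
_WORD_RE = None


def find_word_lazy(text):
    """Compile on first use ("dynamic construction based on a simple test")."""
    global _WORD_RE
    if _WORD_RE is None:  # the simple test
        _WORD_RE = re.compile(r"\w+")
    return _WORD_RE.search(text)


def find_word_cached(text):
    """Rely solely on the re module's internal pattern cache."""
    return re.search(r"\w+", text)
```

Both functions return the same match object for the same input. The difference is only in where the compiled pattern lives: in a module global managed by hand, or in the cache that re.search consults on every call.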
Truth is, people are currently doing the caching themselves, by compiling and then keeping the compiled regex. Saying they're not in a position to know whether or not to do that isn't going to change that. Is it worthwhile having the regex library facilitate this manual caching?
In the absence of profiling numbers showing otherwise, I'd rather see all forms of manual caching, like the conditional checks or a keep=True, go away, as they're dirty and encourage premature "optimization".
If I had been "more aware" of the re internal cache over the last few years, I would have avoided at least a couple of re.compile() calls in my code, I guess.
Maybe this is something that the documentation of re.compile() can help with, by telling people explicitly that this apparently cool feature of pre-compiling actually has a drawback (startup time + a bit of memory usage) and that they won't notice a runtime difference in most cases anyway.
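Anyone who wants profiling numbers for their own patterns could start from something like the following rough sketch. The pattern, the sample text and the function names are assumptions chosen for illustration, and the timings will vary by machine and Python version, so no figures are shown here.

```python
import re
import timeit

# Hypothetical pattern and input, picked only for the measurement.
PATTERN = r"(\d{4})-(\d{2})-(\d{2})"
TEXT = "released on 2013-03-24"

_compiled = re.compile(PATTERN)


def cold_compile():
    # Clear the module cache first so every call pays the full compile cost,
    # approximating what a module-level re.compile() adds at startup.
    re.purge()
    return re.compile(PATTERN)


def via_module_cache():
    # Pattern string is looked up in the re module's internal cache.
    return re.search(PATTERN, TEXT)


def via_precompiled():
    # Manual caching: reuse the compiled pattern object directly.
    return _compiled.search(TEXT)


def measure(number=1000):
    """Return total seconds for `number` calls of each variant."""
    variants = (cold_compile, via_module_cache, via_precompiled)
    return {f.__name__: timeit.timeit(f, number=number) for f in variants}


if __name__ == "__main__":
    for name, seconds in measure().items():
        print(f"{name}: {seconds:.6f}s")
```

Comparing `cold_compile` against the other two gives a feel for the startup cost, and comparing `via_module_cache` against `via_precompiled` shows how much the cache lookup costs per call.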