[Python-Dev] New regex module for 3.2?
Gregory P. Smith
greg at krypto.org
Tue Jul 27 07:40:30 CEST 2010
On Thu, Jul 22, 2010 at 3:26 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Fri, Jul 23, 2010 at 12:42 AM, Georg Brandl <g.brandl at gmx.net> wrote:
> > Sure -- I don't think this is a showstopper for regex. However if we don't
> > include regex in a future version, we might think about increasing MAXCACHE
> > a bit, and maybe not clear the cache when it reaches its max length, but
> > rather remove another element.
>
> Yikes, I didn't know it did that. That certainly sounds like it should
> be an RFE in its own right - some basic form of Least Recently Used
> accounting should be beneficial (although the extra bookkeeping might
> hurt scripts that aren't hitting the cache limit).
>
>
A max cache size of 100 was too small. I just increased it to 500 in the
py3k branch along with implementing a random replacement cache overflow
policy. It now randomly drops 20% of the compiled regular expression cache
instead of simply dropping the entire cache on overflow.
With the regex_v8 benchmark, the better cache replacement policy alone sped
it up ~7%, while raising the cache size on top of that (likely meaning the
cache never overflowed) sped it up ~25%.
Random replacement without dropping everything at least means apps thrashing
the cache degrade much more gracefully.
http://svn.python.org/view?view=rev&revision=83173
This change should be incorporated into MRAB's regex module in order to keep
comparisons fair.
-gps