<br><div class="gmail_quote">On Thu, Jul 22, 2010 at 3:26 PM, Nick Coghlan <span dir="ltr"><<a href="mailto:ncoghlan@gmail.com">ncoghlan@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div class="im">On Fri, Jul 23, 2010 at 12:42 AM, Georg Brandl <<a href="mailto:g.brandl@gmx.net">g.brandl@gmx.net</a>> wrote:<br>
> Sure -- I don't think this is a showstopper for regex. However if we don't<br>
> include regex in a future version, we might think about increasing MAXCACHE<br>
> a bit, and maybe not clear the cache when it reaches its max length, but<br>
> rather remove another element.<br>
<br>
</div>Yikes, I didn't know it did that. That certainly sounds like it should<br>
be an RFE in its own right - some basic form of Least Recently Used<br>
accounting should be beneficial (although the extra bookkeeping might<br>
hurt scripts that aren't hitting the cache limit).<br>
<div class="im"><br></div></blockquote><div><br></div><div>A max cache size of 100 was too small. I just raised it to 500 in the py3k branch and implemented a random replacement policy for cache overflow. On overflow it now drops a random 20% of the compiled regular expression cache instead of simply dropping the entire cache.</div>
<div><br></div><div>On the regex_v8 benchmark, the better cache replacement policy by itself sped it up ~7%, while raising the cache size on top of that (likely meaning the cache never overflowed) sped it up ~25%.</div><div>
<br>
</div><div>Random replacement without dropping everything at least means apps thrashing the cache degrade much more gracefully.</div><div><br></div><div> <a href="http://svn.python.org/view?view=rev&revision=83173">http://svn.python.org/view?view=rev&revision=83173</a></div>
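<div><br></div><div>To illustrate the graceful degradation point, here is a toy simulation, not a benchmark. The workload is hypothetical: it cycles through one more key than the cache can hold, which is the worst case for the old clear-everything policy (it never gets a hit on this pattern). The helper names are invented for the sketch, and the random policy is seeded so runs are repeatable.</div>
<pre>
```python
import random


def count_hits(keys, max_size, evict):
    """Replay keys against a bounded dict cache and count the cache hits."""
    cache = {}
    hits = 0
    for key in keys:
        if key in cache:
            hits += 1
            continue
        if len(cache) >= max_size:
            evict(cache)  # make room before inserting the new entry
        cache[key] = True
    return hits


def clear_all(cache):
    """Old policy: drop every entry on overflow."""
    cache.clear()


def make_random_drop(fraction=0.2, seed=0):
    """New policy: drop a random fraction of the entries on overflow."""
    rng = random.Random(seed)  # seeded only to make the sketch repeatable

    def evict(cache):
        n_drop = max(1, int(len(cache) * fraction))
        for key in rng.sample(list(cache), n_drop):
            del cache[key]

    return evict


# Hypothetical thrashing workload: one more distinct key than the cache holds.
MAX_SIZE = 100
workload = list(range(MAX_SIZE + 1)) * 20

clear_hits = count_hits(workload, MAX_SIZE, clear_all)
random_hits = count_hits(workload, MAX_SIZE, make_random_drop())
print("clear-all hits:", clear_hits)
print("random-drop hits:", random_hits)
```
</pre>
<div>Real applications will not look exactly like this cyclic pattern, so treat the numbers as a qualitative picture only.</div>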
<div><br></div><div>This change should be incorporated into MRAB's regex module in order to keep comparisons fair.</div><div><br></div><div>-gps</div><div><br></div></div>