On Thu, Jul 22, 2010 at 9:34 PM, Georg Brandl <g.brandl@gmx.net> wrote:
So, I thought there wasn't a difference in performance for this use case (which is compiling a lot of regexes and matching most of them only a few times in comparison). However, I found that looking at the regex caching is very important in this case: re._MAXCACHE is by default set to 100, and regex._MAXCACHE to 1024. When I set re._MAXCACHE to 1024 before running the test suite, I get times around 18 (!) seconds for re.
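A rough way to see the cache effect described above is to time many distinct patterns that are each matched only a few times, under different cache sizes. This is only a sketch: `time_matches` and its parameters are hypothetical names made up for illustration, `re._MAXCACHE` is a private CPython detail that may change or be absent, and the absolute timings will vary by machine and Python version.

```python
import re
import time


def time_matches(n_patterns, rounds, maxcache):
    """Time `rounds` passes of re.match over n_patterns distinct patterns.

    Hypothetical helper for illustration only. It temporarily overrides the
    private re._MAXCACHE attribute (if present) and restores it afterwards.
    """
    old = getattr(re, "_MAXCACHE", None)  # private attribute, may not exist
    patterns = [r"item%d\d+" % i for i in range(n_patterns)]
    try:
        if old is not None:
            re._MAXCACHE = maxcache
        re.purge()  # start from an empty compiled-pattern cache
        start = time.perf_counter()
        for _ in range(rounds):
            for i, pat in enumerate(patterns):
                # Passing the pattern as a string makes re consult its cache.
                re.match(pat, "item%d42" % i)
        return time.perf_counter() - start
    finally:
        if old is not None:
            re._MAXCACHE = old
        re.purge()


if __name__ == "__main__":
    # More distinct patterns than the small cache can hold, fewer than the large one.
    for size in (100, 1024):
        elapsed = time_matches(n_patterns=500, rounds=5, maxcache=size)
        print("cache size %4d: %.3f s" % (size, elapsed))
```

When the number of distinct patterns exceeds the cache size, patterns are likely to be recompiled on later passes, which is the kind of compile-heavy workload a test suite produces.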
That still fits with the compile/match performance trade-off changes between re and regex. It does make it clear that this isn't going to be a win across the board, though: things like test suites are going to have more one-off regex operations than a long-running web server or a filesystem or database scanning operation.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia