[Python-ideas] No need to add a regex pattern literal

Ma Lin malincns at 163.com
Mon Dec 31 08:02:56 EST 2018


On 18-12-31 19:47, Antoine Pitrou wrote:
 > The complaint is that the global cache is still too costly.
 > See measurements in https://bugs.python.org/issue35559

In that issue, using a global variable `_has_non_base16_digits` [1] gives 
a 30% speedup.
Is the re module's internal cache [2] really that bad?

If we rewrote the re module's cache in C and used a custom data structure, 
maybe we would get a small speedup.
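
A rough way to compare the two approaches yourself is sketched below. The
pattern, the input string and the function names are made up for
illustration and are not taken from PR 11287; the numbers will vary by
machine and Python version, so treat this as a sketch rather than a
proper benchmark.

```python
import re
import timeit

# Hypothetical pattern and input, chosen only for illustration.
PATTERN = r"[^0-9a-fA-F]"
TEXT = "deadbeef" * 4

# Approach 1: compile once and keep the bound search method in a
# module-level global, so no cache lookup happens per call.
_has_non_hex = re.compile(PATTERN).search


def check_global(s):
    """Return True if s contains only hex digits (precompiled pattern)."""
    return _has_non_hex(s) is None


# Approach 2: call re.search() every time and rely on the re module's
# internal cache to avoid recompiling the pattern.
def check_cached(s):
    """Return True if s contains only hex digits (re module cache)."""
    return re.search(PATTERN, s) is None


if __name__ == "__main__":
    n = 200_000
    t_global = timeit.timeit(lambda: check_global(TEXT), number=n)
    t_cached = timeit.timeit(lambda: check_cached(TEXT), number=n)
    print(f"global variable: {t_global:.3f} s")
    print(f"re module cache: {t_cached:.3f} s")
```

Both functions return the same result; the only difference is whether each
call goes through the lookup in re._compile() quoted below.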

[1] `_has_non_base16_digits` in PR 11287:
    https://github.com/python/cpython/pull/11287/files

[2] re module's internal cache code:
    https://github.com/python/cpython/blob/master/Lib/re.py#L268-L295

_cache = {}  # ordered!
_MAXCACHE = 512
def _compile(pattern, flags):
    # internal: compile pattern
    if isinstance(flags, RegexFlag):
        flags = flags.value
    try:
        return _cache[type(pattern), pattern, flags]
    except KeyError:
        pass
    ...


