[Python-ideas] No need to add a regex pattern literal

M.-A. Lemburg mal at egenix.com
Mon Dec 31 06:31:06 EST 2018


On 31.12.2018 12:23, Antoine Pitrou wrote:
> On Thu, 27 Dec 2018 19:48:40 +0800
> Ma Lin <malincns at 163.com> wrote:
>> We can use this literal to represent a compiled pattern, for example:
>>
>>  >>> p"(?i)[a-z]".findall("a1B2c3")  
>> ['a', 'B', 'c']
>>
>>  >>> compiled = p"(?<=abc)def"
>>  >>> m = compiled.search('abcdef')
>>  >>> m.group(0)  
>> 'def'
>>
>>  >>> rp'\W+'.split('Words, words, words.')  
>> ['Words', 'words', 'words', '']
>>
>> This allows the peephole optimizer to store the compiled pattern in the
>> .pyc file, so we get a performance optimization similar to replacing a
>> constant set with a frozenset in the .pyc file.
>>
>> Then issues such as [1] can be solved perfectly.
>>
>> [1] Optimize base64.b16decode to use compiled regex
>>     https://bugs.python.org/issue35559
> 
> The simple solution to the perceived performance problem (not sure how
> much of a problem it is in real life) is to have a stdlib function that
> lazily-compiles a regex (*). Just like "re.compile", but lazy: you don't
> bear the cost of compiling when simply importing the module, but once
> the pattern is compiled, there is no overhead for looking up a global
> cache dict.
> 
> No need for a dedicated literal.
> 
> (*) Let's call it "re.pattern", for example.
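
For reference, the lazy wrapper described above might look roughly like
the sketch below. This is only an illustration under my own assumptions:
the class name LazyPattern and its attributes are hypothetical, and
nothing like it exists in the re module under that name.

```python
import re


class LazyPattern:
    """Hypothetical stand-in for the proposed "re.pattern" (not a stdlib API)."""

    def __init__(self, pattern, flags=0):
        self._pattern = pattern
        self._flags = flags
        self._compiled = None  # nothing is compiled at construction time

    def _get(self):
        # Compile on first use only, then reuse the stored pattern object.
        if self._compiled is None:
            self._compiled = re.compile(self._pattern, self._flags)
        return self._compiled

    def __getattr__(self, name):
        # Only reached for names not found on the wrapper itself:
        # forward search(), findall(), split(), ... to the compiled pattern.
        return getattr(self._get(), name)


# Creating the wrapper at import time costs no regex compilation.
WORD_SEP = LazyPattern(r"\W+")

# The first method call triggers the compilation.
print(WORD_SEP.split("Words, words, words."))  # ['Words', 'words', 'words', '']
```

After the first call the wrapper holds the compiled pattern directly, so
later calls do not go through the re module's cache.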

No need for a new function :-)

We already have module-level functions such as re.search() and
re.match(), which compile the pattern on the fly and cache the compiled
result. Perhaps the documentation should hint at this more explicitly...

https://docs.python.org/3.7/library/re.html
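
To illustrate, here is a small sketch that uses only the module-level
functions with plain pattern strings. The helper names find_letters and
after_abc are made up for this example; the patterns are the ones from
the original proposal.

```python
import re


def find_letters(text):
    # The pattern string is passed directly: re compiles it on first use
    # and finds it in its internal cache on later calls.
    return re.findall(r"(?i)[a-z]", text)


def after_abc(text):
    # Same idea with re.search(); no explicit re.compile() anywhere.
    m = re.search(r"(?<=abc)def", text)
    return m.group(0) if m else None


print(find_letters("a1B2c3"))  # ['a', 'B', 'c']
print(after_abc("abcdef"))     # def
```

Importing a module written this way compiles nothing. Each call still
does a cache lookup keyed on the pattern and flags, which is the small
overhead the lazy-compile idea aims to avoid.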

-- 
Marc-Andre Lemburg
eGenix.com



