[Python-ideas] No need to add a regex pattern literal
M.-A. Lemburg
mal at egenix.com
Mon Dec 31 06:31:06 EST 2018
On 31.12.2018 12:23, Antoine Pitrou wrote:
> On Thu, 27 Dec 2018 19:48:40 +0800
> Ma Lin <malincns at 163.com> wrote:
>> We can use this literal to represent a compiled pattern, for example:
>>
>> >>> p"(?i)[a-z]".findall("a1B2c3")
>> ['a', 'B', 'c']
>>
>> >>> compiled = p"(?<=abc)def"
>> >>> m = compiled.search('abcdef')
>> >>> m.group(0)
>> 'def'
>>
>> >>> rp'\W+'.split('Words, words, words.')
>> ['Words', 'words', 'words', '']
>>
>> This would allow the peephole optimizer to store the compiled pattern
>> in the .pyc file, giving a performance optimization similar to
>> replacing a constant set with a frozenset in the .pyc file.
>>
>> Issues such as [1] could then be solved cleanly.
>> [1] Optimize base64.b16decode to use compiled regex
>> [1] https://bugs.python.org/issue35559
>
> The simple solution to the perceived performance problem (not sure how
> much of a problem it is in real life) is to have a stdlib function that
> lazily-compiles a regex (*). Just like "re.compile", but lazy: you don't
> bear the cost of compiling when simply importing the module, but once
> the pattern is compiled, there is no overhead for looking up a global
> cache dict.
>
> No need for a dedicated literal.
>
> (*) Let's call it "re.pattern", for example.
No need for a new function :-)
We already have re.search() and re.match(), which compile the pattern
on the fly and cache the compiled result. Perhaps the documentation
should hint at this more explicitly...
https://docs.python.org/3.7/library/re.html
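
To illustrate the point, here is a minimal sketch that relies only on the
module-level functions. The helper name letters() and the constant PATTERN
are made up for this example; the pattern itself is the one from the quoted
proposal. Nothing is compiled at import time: the first call compiles the
pattern and later calls reuse the re module's internal cache (the cache
size and eviction policy are implementation details).

```python
import re

# Same pattern as in the quoted p"(?i)[a-z]" example; the name is hypothetical.
PATTERN = r"(?i)[a-z]"


def letters(text):
    """Return all ASCII letters in text, ignoring case.

    re.findall() compiles PATTERN on first use and stores the compiled
    object in the module's internal cache, so repeated calls skip the
    compilation step and only pay for a cache lookup.
    """
    return re.findall(PATTERN, text)


first = letters("a1B2c3")   # compiles the pattern, then matches
second = letters("x9Y8")    # reuses the cached compiled pattern

print(first)
print(second)

# On CPython, asking for the same pattern twice usually returns the very
# same compiled object. This is an implementation detail, not a guarantee.
print(re.compile(PATTERN) is re.compile(PATTERN))
```

The trade-off compared with a precompiled module-level pattern is one cache
lookup per call, which is the overhead the lazy "re.pattern" suggestion above
tries to avoid.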
--
Marc-Andre Lemburg
eGenix.com