[Python-ideas] re.compile_lazy - on first use compiled regexes

Chris Angelico rosuav at gmail.com
Sat Mar 23 15:25:15 CET 2013

On Sun, Mar 24, 2013 at 1:09 AM, Joao S. O. Bueno <jsbueno at python.org.br> wrote:
>> Hmm. Here's a mad thought - a bit of latticed casementing, if you
>> like. Could the compiled regexes be stored in the .pyc file? That
>> already has version tagging done. All it'd take is some sort of
>> extension mechanism that says "hey, here's some additional data that
>> the pyc might want to make use of". Or would that overly complicate
>> matters?
> I can't see how this could be achieved but for adding a special
> syntax that would compile reg-exps at parsing time.  Then, we might as well use
> Perl instead :-)

Yeah, that's the most obvious form - some kind of regex literal
syntax. I was thinking, though, that there might be some sort of
extension to the pyc format that lets any module add precompiled data
to it; the trouble would then be figuring out how to recognize what
ought to get dumped into the pyc. It'd effectively need to be
something that gets added to the code like:

import re
foo = re.compile('fo+')
bar = re.compile('ba+r')

That could then pre-populate some kind of cache that gets loaded with
the pyc, and then when re.compile() gets a particular string, it looks
it up in the cache and finds the precompiled version.

Of course, this would quite possibly be more effort than it's worth.
Complicating the pyc format in this way needs a lot of justification.

