[Python-ideas] re.compile_lazy - on first use compiled regexes

Chris Angelico rosuav at gmail.com
Sat Mar 23 14:15:30 CET 2013


On Sat, Mar 23, 2013 at 11:52 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> It would suffice to add pickle/marshal support for the
> compiled RE code. This could then be loaded from a string
> embedded in the module code on startup.
>
> E.g.
> # rx = re.compile('.*')
> rx = pickle.loads('asdfsadfasdf')

What would that do to versioning? Currently, as I understand it, the
compiled RE is purely an implementation detail; the re module is free
to change how it stores it at any time. Pickles (again, as I
understand it - I may be wrong) should be readable on other versions
of Python (on newer versions, at least), on other architectures, and
so on; would this be a problem?

Alternatively, at the expense of some storage space, there could be
some kind of fallback: store a version tag alongside the dumped regex,
and if that tag doesn't exactly match the running Python's tag, ignore
the dumped version and just compile the pattern as normal.
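
A rough sketch of that fallback, just to make the idea concrete. The
names dump_regex and load_regex are made up for illustration, and
using sys.implementation.cache_tag as "the tag" is my assumption, not
an existing re feature. Note also that pickling a compiled pattern in
CPython today only records the pattern string and flags, so the load
still recompiles; the sketch shows the tag check, not a real speedup.

```python
import pickle
import re
import sys

# Assumption: the interpreter's cache tag (e.g. "cpython-312") stands in
# for "the creating Python's tag". It may be None on some implementations.
CURRENT_TAG = sys.implementation.cache_tag


def dump_regex(pattern, flags=0):
    """Return a (tag, pattern, flags, payload) record for embedding in a module."""
    compiled = re.compile(pattern, flags)
    return (CURRENT_TAG, pattern, flags, pickle.dumps(compiled))


def load_regex(record):
    """Use the dumped payload only if its tag matches; otherwise recompile."""
    tag, pattern, flags, payload = record
    if tag is not None and tag == CURRENT_TAG:
        return pickle.loads(payload)
    # Tag mismatch: ignore the dumped version and compile as normal.
    return re.compile(pattern, flags)
```

The storage cost is the extra copy of the pattern source, which has to
be kept around so the fallback path has something to compile.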

Hmm. Here's a mad thought - a bit of latticed casementing, if you
like. Could the compiled regexes be stored in the .pyc file? That
already has version tagging done. All it'd take is some sort of
extension mechanism that says "hey, here's some additional data that
the pyc might want to make use of". Or would that overly complicate
matters?
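
For what it's worth, the version tagging a .pyc already carries can be
inspected from Python. This is only a sketch of the existing check,
not of the proposed extension mechanism: it assumes CPython's layout,
where the file begins with the interpreter's magic number, and the
helper names and demo.py are made up for illustration.

```python
import importlib.util
import os
import py_compile
import tempfile


def pyc_matches_interpreter(pyc_path):
    """Return True if the .pyc header starts with this interpreter's magic number."""
    magic = importlib.util.MAGIC_NUMBER
    with open(pyc_path, "rb") as f:
        header = f.read(len(magic))
    return header == magic


def compile_demo():
    """Compile a throwaway module and check the header of the resulting .pyc."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "demo.py")
        with open(src, "w") as f:
            f.write("import re\nrx = re.compile('.*')\n")
        pyc = py_compile.compile(src, cfile=os.path.join(tmp, "demo.pyc"))
        return pyc_matches_interpreter(pyc)
```

Any extra regex data stored in the file would be discarded along with
the rest of the .pyc whenever that header doesn't match.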

ChrisA


