On 23 March 2013 11:09, Joao S. O. Bueno wrote:
On 23 March 2013 10:15, Chris Angelico wrote:
On Sat, Mar 23, 2013 at 11:52 PM, M.-A. Lemburg wrote:
It would suffice to add pickle/marshal support for the compiled RE code. This could then be loaded from a string embedded in the module code on startup.
E.g.

    # rx = re.compile('.*')
    rx = pickle.loads('asdfsadfasdf')
What would that do to versioning? Currently, as I understand it, the compiled RE is a complete implementation detail; at any time, the re module can change how it stores it. Pickles (again, as I understand it - I may be wrong) should be readable on other versions of Python (forward-compatibly, at least), on other architectures, etc, etc; would this be a problem?
Alternatively, at the expense of some storage space, there could be some kind of fallback. If the tag doesn't perfectly match the creating Python's tag, it ignores the dumped version and just compiles it as normal.
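The fallback described above could be sketched roughly as follows. This is only an illustration of the control flow: dump_pattern and load_pattern are hypothetical helpers, sys.implementation.cache_tag is assumed to be a suitable "tag", and pickle merely stands in for whatever serializer of compiled RE code might exist.

```python
import pickle
import re
import sys

# Tag identifying the running interpreter (something like "cpython-312").
# It can be None on some implementations, in which case we always recompile.
CURRENT_TAG = sys.implementation.cache_tag


def dump_pattern(pattern, flags=0):
    """Hypothetical helper: build a record that could be embedded in a module.

    The source pattern and flags are stored next to the dumped blob, which
    is the "expense of some storage space" that makes the fallback possible.
    """
    blob = pickle.dumps(re.compile(pattern, flags))  # stand-in serializer
    return (CURRENT_TAG, pattern, flags, blob)


def load_pattern(record):
    """Hypothetical helper: use the dumped blob only if the tag matches."""
    tag, pattern, flags, blob = record
    if tag is not None and tag == CURRENT_TAG:
        return pickle.loads(blob)
    # Tag mismatch: ignore the dumped version and compile as normal.
    return re.compile(pattern, flags)
```

A record written by a different interpreter never has its blob touched, so the blob format can stay an implementation detail of the version that wrote it.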
Please note that compiled regexps can already be pickled straightforwardly.
Unfortunately, presumably to avoid the version issues you mention, a look at the pickled string suggests that unpickling just calls "re.compile" with the original regex, so there would be no gain from the implementation as it is.
(I should stop being so lazy and check what unpickling a regexp actually does. Ah, Ezio found it while I was at it.)
There it is, straight in re.py:

    import copyreg

    def _pickle(p):
        return _compile, (p.pattern, p.flags)

    copyreg.pickle(_pattern_type, _pickle, _compile)

So pickling regexps as they are now is definitely no speed-up.
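A quick way to check this from the outside, as a sketch that assumes a Python 3 interpreter where the copyreg registration quoted from re.py is still in place: the pickle stream carries the pattern source text, and the registered reducer returns a callable plus (pattern, flags) rather than any compiled code.

```python
import copyreg
import pickle
import re

rx = re.compile("[a-z]+[0-9]*", re.IGNORECASE)
payload = pickle.dumps(rx)

# The original pattern text is stored verbatim in the pickle stream.
contains_source = rx.pattern.encode("ascii") in payload

# Look up the reducer that re registered for its pattern type and call it.
reducer = copyreg.dispatch_table[type(rx)]
func, args = reducer(rx)

# Unpickling amounts to calling func(*args), i.e. compiling again.
restored = pickle.loads(payload)

print(contains_source)
print(func.__name__, args)
print(restored.pattern == rx.pattern, restored.flags == rx.flags)
```

Since the arguments are just the source pattern and the flags, loading the pickle costs at least as much as calling re.compile directly.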