On 23 March 2013 10:15, Chris Angelico email@example.com wrote:
On Sat, Mar 23, 2013 at 11:52 PM, M.-A. Lemburg firstname.lastname@example.org wrote:
It would suffice to add pickle/marshal support for the compiled RE code. This could then be loaded from a string embedded in the module code on startup.
E.g.
# rx = re.compile('.*')
rx = pickle.loads('asdfsadfasdf')
What would that do to versioning? Currently, as I understand it, the compiled RE is a complete implementation detail; at any time, the re module can change how it stores it. Pickles (again, as I understand it - I may be wrong) should be readable on other versions of Python (forward-compatibly, at least), on other architectures, etc, etc; would this be a problem?
Alternatively, at the expense of some storage space, there could be some kind of fallback. If the tag doesn't perfectly match the creating Python's tag, it ignores the dumped version and just compiles it as normal.
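The fallback described above could look roughly like the following. This is only a sketch under stated assumptions: dump_tagged and load_or_compile are hypothetical helper names, sys.implementation.cache_tag stands in for "the creating Python's tag", and the payload is an ordinary pickle of the pattern object, not a real dump of the compiled RE code.

```python
import pickle
import re
import sys

# Assumption: the interpreter's cache tag (e.g. 'cpython-312') is a good
# enough stand-in for "the creating Python's tag".
CURRENT_TAG = sys.implementation.cache_tag


def dump_tagged(pattern, flags=0):
    """Hypothetical helper: serialize a compiled regex together with a tag."""
    payload = pickle.dumps(re.compile(pattern, flags))
    return pickle.dumps((CURRENT_TAG, payload))


def load_or_compile(pattern, blob, flags=0):
    """Hypothetical helper: use the dump only if its tag matches exactly."""
    tag, payload = pickle.loads(blob)
    if tag != CURRENT_TAG:
        # Tag mismatch: ignore the dumped version and compile as normal.
        return re.compile(pattern, flags)
    return pickle.loads(payload)


blob = dump_tagged(r"a+b")
rx = load_or_compile(r"a+b", blob)

# A blob written under a different tag is never unpickled.
stale = pickle.dumps(("some-other-tag", b"not a valid payload"))
rx_fallback = load_or_compile(r"a+b", stale)
```

The cost is the extra storage mentioned above: the module has to carry both the original pattern and the dump.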
Please note that compiled regexps can already be pickled straightforwardly.
Unfortunately, to avoid the version issues you mention, and judging from a look at the pickled string, it appears to just call "re.compile" with the original regex on unpickle, so there would be no gain from the implementation as is.
(I should stop being so lazy and check what unpickling a regexp actually does. Ah, Ezio found it while I was at it.)
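What the pickle of a pattern object carries can be checked directly. A minimal sketch, assuming CPython 3, where the re module registers its reducer for pattern objects through copyreg; that lookup pokes at an implementation detail and may not hold on other versions or implementations:

```python
import copyreg
import pickle
import re

rx = re.compile("spam[0-9]+eggs", re.IGNORECASE)

# The pickle carries the pattern source text, not the compiled program.
data = pickle.dumps(rx)
source_in_pickle = b"spam[0-9]+eggs" in data

# Assumption: re registered a reducer for pattern objects via copyreg.
# It returns a callable plus the arguments used to rebuild the object.
reducer = copyreg.dispatch_table[type(rx)]
rebuild, args = reducer(rx)

# Unpickling therefore goes through a normal compile of the source text.
restored = pickle.loads(data)
```

Here args is just the pattern string and the flags, so loading the pickle costs the same as compiling the regex from scratch (or hitting the re module's cache).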
Hmm. Here's a mad thought - a bit of latticed casementing, if you like. Could the compiled regexes be stored in the .pyc file? That already has version tagging done. All it'd take is some sort of extension mechanism that says "hey, here's some additional data that the pyc might want to make use of". Or would that overly complicate matters?
I can't see how this could be achieved but for adding a special syntax that would compile reg-exps at parsing time. Then, we might as well use Perl instead :-)
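One way to see the obstacle: a .pyc file holds a marshalled code object, and marshal only accepts a fixed set of built-in types. A small sketch, assuming CPython 3, showing that a pattern object is rejected while its source string is not:

```python
import marshal
import re

rx = re.compile(".*")

# marshal is the serializer behind .pyc files; it refuses pattern objects.
try:
    marshal.dumps(rx)
    marshalled = True
except ValueError:
    marshalled = False

# The pattern's source text is a plain str and marshals without trouble,
# which is all a .pyc can store for a regex today.
source_blob = marshal.dumps(rx.pattern)
```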
But maybe some custom serialization could go straight into the sre_code, which would properly serialize its objects as Python bytecode, and then some helper functions could load them from a custom-made pyc file.
These pre-generated pycs would be built at Python build time.