Python speed and `pcre'

François Pinard pinard at iro.umontreal.ca
Tue Aug 31 15:55:36 CEST 1999


Hi, people.  Still learning and experimenting with Python. :-)

After having translated some code (not big, but not small) from Perl to
Python, I discover it runs ten times slower.  I have not learned to profile
yet, and I am not sure it would help, having read that profiling gives
call counts, but no elapsed time.  Is that right?
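
For what it is worth, here is a minimal sketch of how that question could
be checked.  It assumes a current Python with the standard `cProfile' and
`pstats' modules; the `work' function is only a made-up stand-in for the
translated program.  The report it prints has time columns (tottime,
cumtime) next to the call counts (ncalls):

```python
import cProfile
import io
import pstats
import re


def work():
    # Hypothetical stand-in workload: many regular expression substitutions.
    total = 0
    for i in range(2000):
        total += len(re.sub(r"\d+", "#", "line %d of %d" % (i, 2000)))
    return total


# Collect profiling data only around the call of interest.
profiler = cProfile.Profile()
profiler.enable()
result = work()
profiler.disable()

# Format the five most expensive entries, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time should make it visible whether the regular
expression calls dominate the run.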

My intuition tells me that the amount of regular expression matching might
provide an explanation.  Here is my set of _hypotheses_ (I'm not sure):

* Perl keeps the compiled form of every /REGEXP/ that does not use string
  interpolation,
* Python's cache for compiled REGEXPs is less effective than I imagined.

I would be tempted to try using, in Python, a trick like the one `gettext'
uses to avoid retranslating an already translated message.  That is, for
_each_ textual `gettext' call, a #define macro expands into code that
statically allocates a local variable used to cache the result of the call,
and uses this cached result afterwards.

My problem, and I hope you will have some good advice to give for it :-),
is that I do not see how to do this legibly.  I could group a lot of
`re.compile' calls together and use the corresponding variables afterwards,
but I would much rather keep the REGEXP expressions textual near the place
they are needed, and have something like:

        global gensymed-local
        if gensymed-local is None:
            gensymed-local = re.compile(REGEXP)
        gensymed-local.sub(...)

This would have the advantage of keeping the REGEXP where it is meaningful
in the code.  Yet, creating and maintaining all those `gensymed-local'
variables by hand might be fairly tedious, not to mention the code needed
to ensure they all start out as None.  (But maybe `global' already
guarantees that?  If yes, may I count on it, that is, does Python promise
it is unlikely to change?)

Of course, seeing the above clutter, the idea of a macro-generator is
surely tempting, but this would mean resorting to fairly heavy external
machinery for something which should be straightforward.  If Python
included a macro-generator, it would be a nice thing...

Is there something fundamental I am missing, and some simpler code
that would have the desired effect?  Or else, is there another approach
that could give me more speed without giving up too much legibility?
I'm surely ready to accept some slowdown going from Perl to Python, but
a slowdown factor of 10 is a bit much, in my opinion.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard




