Performance in exec environments

Steven D'Aprano steve+comp.lang.python at pearwood.info
Wed Jan 14 20:51:56 EST 2015


Jean-Baptiste Braun wrote:

> 2015-01-13 22:48 GMT+01:00 Steven D'Aprano <
> steve+comp.lang.python at pearwood.info>:
> 
>> So you have been comparing:
>>
>>     2
>>
>> versus
>>
>>     exec('1+1')
>>
>>
>> The first case just fetches a reference to a pre-existing int object, and
>> then deletes the reference. That's fast.
>>
>> The second case:
>>
>> - creates a new string '1+1'
>> - does a global lookup to find the built-in exec function
>> - passes the string to the function
>> - the function then parses that string and compiles it to byte-code
>> - runs the peephole optimizer over it
>> - and finally executes the byte code for "2", same as above.
>>
>> Only the last step is the same as your earlier test case.
>>
> What I don't understand is the ratio between tests 2/4 and tests 1/3.
> 
> Let 0.0229 sec be the execution time to read a bytecode (1st test).
> Executing that bytecode twice takes 0.042 sec (test 3), which looks
> consistent.
> 
> Let 11.6 sec be the execution time to call the global exec, parse, do some
> stuff and read a bytecode (test 2). I'm trying to understand why doing
> the same thing and reading one more bytecode takes so much longer (15.7
> sec) in comparison.

I don't know. What code are you comparing? If you are comparing these:

exec('1 + 1')
exec('1 + 1; 1 + 1')

the second one has twice as much source code to parse and compile.
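
You can check that directly: timing the two strings side by side should show
the longer one costing noticeably more per loop, though somewhat less than
double, because part of the overhead (building the string, looking up and
calling exec) is fixed regardless of how much source you pass in:

python -m timeit "exec('1 + 1')"
python -m timeit "exec('1 + 1; 1 + 1')"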

You can eliminate much (but not all) of the overhead of exec by pulling the
compilation step out:

[steve@ando ~]$ python -m timeit "exec('1+1')"
100000 loops, best of 3: 18.5 usec per loop
[steve@ando ~]$ python -m timeit -s \
"c = compile('1+1', '', 'exec')" "exec(c)"
1000000 loops, best of 3: 1.49 usec per loop


Obviously this only helps if you are executing the same code repeatedly.
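
As an aside, disassembling the compiled code also shows why only the very
last step matches your bare "2" test: by the time the byte-code runs, the
peephole optimizer has already folded 1+1 into the constant 2. A quick way
to see that (the exact output varies a little between Python versions):

import dis
c = compile('1+1', '', 'exec')
dis.dis(c)   # expect a LOAD_CONST of 2 rather than a BINARY_ADD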


>> Bigger savings come from avoiding exec. Instead, try to use factory
>> functions, closures, etc. If you give an example of what you are trying
>> to generate with exec, we may be able to offer an alternative.
>>
> I think I'm going to compile before exec'ing.
> 
> What I'm trying to do is to map a transformation description in a markup
> language (XSLT) onto Python to improve execution time. Here is a
> simplification of what it looks like:
> 
> XSLT:
> <title>
>  <xsl:choose>
>   <xsl:when test="gender='M'">
>    Mr
>   </xsl:when>
>   <xsl:otherwise>
>    Mrs
>   </xsl:otherwise>
>  </xsl:choose>
> </title>
> 
> Generated Python:
> print('<title>')
> if gender == 'M':
>     print('Mr')
> else:
>     print('Mrs')
> print('</title>')


I'm not entirely sure of the context here. If you're only executing that "if
gender ==" block once, then trying to optimize this is a waste of time. The
time you take to pre-compile will be at least as much as the time you save
from a single call. Only if you have multiple calls will it help.
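
If you do have many records to push through the same generated block, one
middle ground (just a sketch; `source` and the example records are made-up
names) is to compile the generated source once and feed each record's values
in through exec's globals argument:

source = """\
print('<title>')
if gender == 'M':
    print('Mr')
else:
    print('Mrs')
print('</title>')
"""

code = compile(source, '<generated>', 'exec')
for record in [{'gender': 'M'}, {'gender': 'F'}]:
    # the globals dict makes `gender` visible to the generated code
    exec(code, {'gender': record['gender']})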

Ideally, instead of exec'ing a block of code, it would be better to create a
function and call the function:

# === Don't do this! ===
while parsing XSLT:    # pseudocode
    if tag == 'choose':
        # Build the generated code as a string. The generated lines must
        # start at column zero, or exec() will raise an IndentationError.
        handle_title = """\
print('<title>')
if gender == 'M': print('Mr')
else: print('Mrs')
print('</title>')"""

# much later... (assumes `gender` is defined where the exec() runs)
exec(handle_title)
print("Smith")
exec(handle_title)
print("Jones")


# === This is better ===
while parsing XSLT:    # pseudocode
    if tag == 'choose':
        def handle_title():
            print('<title>')
            if gender == 'M': print('Mr')
            else: print('Mrs')
            print('</title>')


# much later... (again, `gender` is assumed to be a global here)
handle_title()
print("Smith")
handle_title()
print("Jones")


Understand that this is just a sketch of a solution, not an actual solution.
But the important thing is that if you can build an actual function rather
than exec'ing source code, that will be much faster. If the XSLT file only
has a small number of functions such as "choose", you may be able to use
factory functions to build the functions and avoid exec.
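
As a very rough sketch of that factory idea (every name here is invented,
and a real XSLT walker would need much more), the <xsl:choose> above might
be turned into a closure like this:

def make_choose(tag, test_attr, test_value, when_text, otherwise_text):
    # Build and return a handler function directly, instead of building
    # source code and exec'ing it later.
    def handler(record):
        print('<%s>' % tag)
        if record.get(test_attr) == test_value:
            print(when_text)
        else:
            print(otherwise_text)
        print('</%s>' % tag)
    return handler

handle_title = make_choose('title', 'gender', 'M', 'Mr', 'Mrs')
handle_title({'gender': 'M'})   # prints <title>, Mr, </title>

The handler is compiled once, when the def statement runs, and after that
each use is just an ordinary function call.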

And if not... oh well. There is nothing wrong with using exec. True, it is a
little slower, but how much data do you have to process? Engaging in
unnecessary optimization to shave the run time from 20 minutes to 19
minutes is probably a waste of effort.




-- 
Steven



