[Python-Dev] proposal+patch: sys.gettickeraccumulation()

Sun Nov 14 20:18:14 CET 2004

Nick Coghlan wrote:

> Ralf W. Grosse-Kunstleve wrote:
> >    A pure Python program will spend the bulk of the time interpreting
> >    bytecode.
> 
> Perhaps, perhaps not.

Right. But remember what the actual goal is: we want to answer the
question "Is it worth reimplementing a piece currently written
in Python in C/C++?"

> A heck of a lot of the Python core isn't written 
> in Python - time spent executing builtins, or running methods of builtin 
> objects usually doesn't involve the main interpreter loop (we're 
> generally in pure C-code at that point).

If a piece of Python code leads to heavy use of complex, time-consuming
builtin operations it will be of less benefit to reimplement that code
in C/C++. This is exactly what we want to learn.

> I'm curious how the suggested feature can provide any information that 
> is actually useful for optimisation purposes. Just because a low 
> proportion of time is spent in Python code, doesn't mean the Python code 
> isn't at fault for poor performance.
> 
> As an example, in CPython 2.3 and earlier, this:
> 
>    result = ""
>    for x in strings:
>      result += x
> 
> is a lot worse performance-wise than:
> 
>    result = "".join(strings)
> 
> The first version does spend more time in Python code, but the 
> performance killer is actually in the string concatenation C code. So 
> the time is spent in the C code, but the fault lies in the Python code 
> (In Python 2.4, the latter version is still faster, but the difference 
> isn't as dramatic as it used to be).

Exactly. If you try out my patch and look at time/ticks you will see
immediately that there is no point in reimplementing "".join(strings)
in C/C++. Importantly, you don't have to look at the code to arrive at
this conclusion. The time/tick alone will tell you. This is very
helpful if you are working with third-party code.
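
For readers who want to reproduce the comparison quoted above, here is a minimal sketch using the standard timeit module. It measures wall-clock time only, not ticks, since tick counts need the patched interpreter. The function names, list size and repeat count are arbitrary choices for illustration, and the sketch assumes a current Python rather than 2.3/2.4, so the size of the gap will differ from what is described above.

```python
import timeit


def concat_loop(strings):
    # Repeated += concatenation, as in the first quoted snippet.
    result = ""
    for x in strings:
        result += x
    return result


def concat_join(strings):
    # Single join call, as in the second quoted snippet.
    return "".join(strings)


# Arbitrary test data: 10000 short strings.
strings = ["abc"] * 10000

# Time 20 runs of each variant; both produce the same string.
loop_seconds = timeit.timeit(lambda: concat_loop(strings), number=20)
join_seconds = timeit.timeit(lambda: concat_join(strings), number=20)
print("loop: %.4f s, join: %.4f s" % (loop_seconds, join_seconds))
```

The absolute numbers depend on the interpreter version and the machine, so no expected output is given.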

> Knowing "I'm spending x% of the time executing Python code" just isn't 
> really all that interesting,

Right. Sorry if I gave the wrong impression that this could be
interesting. It is indeed not. What is interesting is the estimated
benefit of reimplementing a piece of Python in C/C++. This is in
fact highly correlated with the time/tick.

> I'd rather encourage people to write appropriate benchmark scripts and 
> execute them using "python -m profile <benchmark> ",

This approach is impractical/impossible in the real world.
For example, this is the problem prompting me to implement
sys.gettickeraccumulation():

  http://pyquante.sourceforge.net/

- It is not our code, i.e. it is difficult for us to know where
  the time is spent.
- It makes heavy use of Numeric.
- It has a few innermost loops implemented in C.

We are using only some parts of this library.
Question: if we reimplement these parts completely in C++, what speedup
can we expect?
So we run a whole calculation and print the time/tick, which you can do
with less than a one-minute investment:

  print time.time()/sys.gettickeraccumulation()*1.e6

as the last statement of your code. If the printed value is close to 0.15
on our reference platform we know that the speedup will be in the
neighborhood of 100. Any value higher than 0.15 indicates that the
expected speedup will be less. In our case the value was 0.35, and after
we did the reimplementation in C++ we found a speedup of about 10. We
have other applications with time/tick around 10. Just looking at this
number tells us that there is not much to gain unless we completely
eliminate Python. Bring in the cost for the C++ programmer and the
increased cost of maintaining the C++ code compared to Python, and you
know what we decided (not) to do.
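
The measurement described above can be sketched as a small helper. This is an illustration under stated assumptions, not part of the patch: it assumes the numerator is meant to be the elapsed run time in seconds, it scales the result to microseconds per tick, and the helper names microseconds_per_tick and report are hypothetical. sys.gettickeraccumulation() exists only in an interpreter built with the proposed patch, so the sketch looks it up defensively and does nothing on a stock interpreter.

```python
import sys
import time


def microseconds_per_tick(elapsed_seconds, ticks):
    """Return elapsed time per interpreter tick, in microseconds."""
    if ticks <= 0:
        raise ValueError("tick count must be positive")
    return elapsed_seconds / ticks * 1.0e6


def report(start_time):
    """Print time/tick if the patched interpreter is in use.

    Returns the value in microseconds, or None when
    sys.gettickeraccumulation is not available.
    """
    # Assumption: this function is provided only by the proposed patch.
    get_ticks = getattr(sys, "gettickeraccumulation", None)
    if get_ticks is None:
        return None
    elapsed = time.time() - start_time
    value = microseconds_per_tick(elapsed, get_ticks())
    print("time/tick: %.3f microseconds" % value)
    return value
```

To use it, record start_time = time.time() at the top of the program and call report(start_time) as the last statement.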

> rather than lead 
> them up the garden path with a global "Python/non-Python" percentage 
> estimation utility.

Please consider that that utility is simply printing
time.time()/sys.gettickeraccumulation(), that my patch is trivial, and
that the runtime penalty is close to nonexistent.

Cheers,
        Ralf