[Python-Dev] FAT Python (lack of) performance

Terry Reedy tjreedy at udel.edu
Wed Jan 27 07:16:45 EST 2016


On 1/26/2016 12:51 PM, Stephen J. Turnbull wrote:
> Terry Reedy writes:
>   > On 1/26/2016 12:02 AM, INADA Naoki wrote:
>   >
>   > > People use same algorithm on every language when compares base language
>   > > performance [1].
>   >
>   > The python code is NOT using the same algorithm.  The proof is that the
>   > Python function will return the correct value for, say fib(50) while
>   > most if not all the other versions will not.

Let me try to be clearer.

1. Like everyone else, I would like Python function calls to be faster, 
either in general or in special cases detected during compilation.  This 
will require micro-benchmarks for function calls that do just that: 
first time an empty loop, then time a loop with a call to an empty 
function.  Do the same for various signatures, and maybe other special 
cases.
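A minimal sketch of that kind of micro-benchmark, using the stdlib timeit module (the exact loop counts and the empty-function name are my own choices, not anything prescribed here):

```python
import timeit

def empty():
    """An empty function; calling it measures pure call overhead."""
    pass

# First time an empty loop, then a loop whose body is only a call
# to the empty function; the difference approximates per-call cost.
loop_time = timeit.timeit("for _ in range(1000): pass", number=1000)
call_time = timeit.timeit("for _ in range(1000): f()",
                          globals={"f": empty}, number=1000)

# Rough per-call overhead in seconds (1000 * 1000 calls total).
per_call = (call_time - loop_time) / 1_000_000
```

The same skeleton can then be repeated with different signatures (positional, keyword, *args) to cover the special cases mentioned above.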

2. Cross-language micro-benchmarks aimed at timing specific operations 
are tough.  To run on multiple languages, they must be restricted to a 
lowest common denominator of features.  It is impossible to make every 
implementation perform exactly the same set of operations.  Some 
languages may bundle features together.  Some languages and 
implementations have optimizations that avoid unneeded operations.  Not 
all optimizations can be turned off.

3. While there are trends in the speed of implementations of a 
language, benchmarks time particular implementations.  Shedskin, for 
instance, would compile fib to a C++ function that runs much faster 
than the same fib run on CPython within the restricted subset allowed 
for the benchmark.

> True, but that's not a reasonable criterion for "same algorithm" in
> this context.  Naoki's application ("base language performance"
> benchmarking) requires fib(n) only for n < 40, and run it in a loop
> 100 times if you want 2 more decimal places of precision ("40" is
> appropriate for an implementation with 32-bit ints).

So you agree that the limit of 39 is not intrinsic to the fib function 
or its uses, but is an after-the-fact limit imposed to mask the 
bug-proneness of using fixed-width substitutes for true integers.
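To make the overflow point concrete, here is a small sketch (an iterative fib, chosen for brevity rather than the benchmark's recursive version) showing exactly where signed 32-bit ints give out:

```python
def fib(n):
    """Iterative Fibonacci with fib(0) = 0, fib(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

INT32_MAX = 2**31 - 1  # 2147483647

# fib(47) = 2971215073 already exceeds a signed 32-bit int, so an
# implementation using such ints silently returns wrong values from
# n = 47 on, while CPython's arbitrary-precision ints stay correct.
print(fib(47))  # 2971215073
print(fib(50))  # 12586269025
```

This is why the Python version returns the correct value for fib(50) while a fixed-width implementation does not.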

To my mind, a fairer and more useful benchmark of 'base language 
performance' based on fib would use a wider domain.  The report would 
say that CPython (with lru_cache disallowed) is slow but works over a 
wide range of inputs, while some other implementations of other 
languages run faster for small inputs but fail catastrophically for 
larger inputs.  Users could then make a more informed pick.
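For context, the lru_cache variant that such a benchmark would disallow looks roughly like this; memoization collapses the naive exponential recursion into a linear-time computation, which is why it would distort a cross-language comparison of call overhead:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """Naive recursive Fibonacci, made linear-time by memoization."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(50))  # 12586269025, computed near-instantly
```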

Also see my answer to Sven Kunze.

-- 
Terry Jan Reedy


