[pypy-dev] Contributing to pypy [especially numpy]

Sun Oct 16 20:31:53 CEST 2011

Maciej Fijalkowski, 16.10.2011 20:01:
> On Sun, Oct 16, 2011 at 6:34 PM, Stefan Behnel wrote:
>> Maciej Fijalkowski, 16.10.2011 17:50:
>>> We have proven
>>> already that we can perform several optimizations that are very hard
>>> to perform at the C level. And indeed, while you can always argue
>>> "well, you can just write a better compiler", it's true also for JITs.
>>
>> I wasn't comparing a JIT to another compiler. I was comparing it to a human
>> programmer. A JIT, just like any other compiler, will never be able to
>> *understand* the code it compiles, and it can only apply the optimisations
>> that it was taught. JITs are nice when you need performance quickly and
>> don't care about the last few CPU cycles. However, there are cases where
>> it's not very satisfactory to learn that your JIT compiler, in the current
>> state that it has, can only get you up to, say, 90%, or even 95% of the
>> speed that you need for your problem. In those cases where you do care about
>> the last 5%, and numerics people care about them surprisingly often, you
>> will eventually end up using a low-level language, usually C or Fortran, to
>> make sure you get as much out of your code as possible. JIT compilers are
>> structurally much harder to manually override than static compilers, and
>> they are certainly not designed to help with the "but I know what I'm doing"
>> cases.
>
> I just claim you're wrong here and there are cases where you can't
> beat the JIT compiler, precisely because some stuff depends on runtime
> data and you can't encode all the possibilities in statically
> compiled code (at least in theory).

Regarding David's response, I agree that there are cases where JITs can 
help in limiting the code explosion that you'd get from statically 
generating all possible optimised cases for generic code. A JIT only needs 
to instantiate the cases that really exist at runtime. Obviously, that does 
not automatically mean that the JIT would generate code that is as fast as,
or faster than, what a programmer would write for *one* of the specific cases
by tuning the code accordingly. It just means that it would generate better
code *on average* when looking at the whole corpus, because it can simply
(and quickly) adapt the code to more cases as needed. Whether that's what the
programmer wants depends on the use case. I see the advantage of that
especially in library code that needs to deal with basically all cases 
efficiently, as David pointed out.
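
To make the code explosion point a bit more concrete, here is a toy sketch
in Python. It is not how PyPy or any real JIT works internally. The dtype
list, make_add_kernel() and LazySpecialiser are names made up purely for
illustration, and the "kernel" is just a plain Python function standing in
for generated machine code. It only shows the counting argument: generating
every combination up front versus instantiating only what the running
program actually hits.

```python
from itertools import product

# Hypothetical set of element types a generic "add" might have to support.
DTYPES = ("int8", "int16", "int32", "int64", "float32", "float64")


def make_add_kernel(left, right):
    """Stand-in for generating one specialised kernel for a dtype pair."""
    def kernel(a, b):
        return [x + y for x, y in zip(a, b)]
    kernel.__name__ = "add_%s_%s" % (left, right)
    return kernel


# Static approach: every combination is generated ahead of time,
# whether or not the program ever uses it.
STATIC_KERNELS = {
    pair: make_add_kernel(*pair) for pair in product(DTYPES, repeat=2)
}


class LazySpecialiser:
    """JIT-like approach: build a kernel the first time a case shows up."""

    def __init__(self):
        self.kernels = {}

    def add(self, left, right, a, b):
        key = (left, right)
        kernel = self.kernels.get(key)
        if kernel is None:
            # First time this dtype pair is seen: specialise and cache it.
            kernel = self.kernels[key] = make_add_kernel(left, right)
        return kernel(a, b)


jit_like = LazySpecialiser()
jit_like.add("int32", "int32", [1, 2], [3, 4])
jit_like.add("float64", "int32", [1.5], [2])
jit_like.add("int32", "int32", [5], [6])   # reuses the cached kernel

print(len(STATIC_KERNELS))     # 36 kernels generated up front
print(len(jit_like.kernels))   # 2 kernels actually instantiated
```

Note that nothing here says the two lazily built kernels are any better than
a hand-tuned one for the same case. The saving is only in how many cases
have to exist at all.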

Stefan