[pypy-dev] PyPy to generate C/C++ code

Wed Sep 15 01:11:28 CEST 2010

2010/9/15 Saravanan Shanmugham <sarvi at yahoo.com>:
> I don't expect this python compiler to be for full python but just a Restricted
> statically typed subset of python as defined by Shedskin.
>
> Yes. JIT annotation may not serve the purpose of generating a compiler.
> Hence the porting of the type inference engine and may be use JIT notations if
> it can be.

I've downloaded and read the source code of shedskin.
From what I understand, here are some differences between PyPy and Shedskin.

- Shedskin analyses and generates code directly by walking the AST of a python
module.  (there are two passes: the first to grab information about global types
and functions, the second to emit code)

- Shedskin does very little type inference. Shedskin's type system is based on
C++ templates, and once a variable's type has been determined, generic code is
emitted and the C++ compiler will select the correct implementation.  Other
inference engines also work on the AST; Logilab's pylint, for example, works
much harder to check all instructions and the type of all variables.  Shedskin
does not seem to need such power.

- On the other hand, PyPy analyzes imported modules, and works on the bytecode
of functions living in memory.  It does a complete type inference and emits
low-level C code or Java intermediate representation.

- PyPy has its own way to write generic code and templates, the language for
meta-programming is Python itself!  [I'm referring to loops that
generate classes and functions, and things like "specialize:argtype(0)",
"unrolling_iterable" combined with constant propagation].

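To make the meta-programming point concrete, here is a minimal sketch in plain
Python.  It does not import PyPy, and make_adder, ADDERS, make_point_class,
Point2D and identity are names invented for this example.  It only shows the
shape of the idiom: loops that run once at import time and produce functions
and classes, plus the "specialize:argtype(0)" hint written as a function
attribute.  Under CPython that attribute is inert; the comment on it describes
what the hint is meant to request from the translator, as I understand it.

```python
# Plain-Python sketch of "Python as its own template language".
# Nothing here imports PyPy; all names are hypothetical.

def make_adder(n):
    # Each call builds a distinct function with n fixed as a constant,
    # roughly what a C++ template parameter would give you.
    def add(x):
        return x + n
    add.__name__ = "add_%d" % n
    return add

# A loop that runs once, at import time, and generates functions.
ADDERS = {}
for n in (1, 2, 10):
    ADDERS[n] = make_adder(n)


def make_point_class(fields):
    # A factory that generates a class: one attribute per field name.
    class Point(object):
        def __init__(self, *values):
            for name, value in zip(fields, values):
                setattr(self, name, value)
    Point.__name__ = "Point_" + "_".join(fields)
    return Point

Point2D = make_point_class(("x", "y"))


def identity(x):
    return x

# Assumption: in RPython this hint asks for one specialized copy of the
# function per type of argument 0.  Here it is just an ignored attribute.
identity._annspecialcase_ = "specialize:argtype(0)"
```

For example, ADDERS[10](5) gives 15 and Point2D(3, 4).x gives 3.  The loop
plays the role that a list of template instantiations would play in C++.
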
In most cases, PyPy does not generate better code than Shedskin. When Shedskin
compiles code, it does it well.  And its restrictions are easier to work with;
RPython is really tricky to get right sometimes.

Of course, PyPy's goal is different: it does not only generate low-level C code,
it also generates a JIT compiler that can optimize calls at runtime - in the
context of an interpreter.  I can't see which computations made there could be
applied to static code.

Bottom line: if you want to generate efficient C code from python, use (and
improve) Shedskin.  If you want python code to run faster, don't translate
anything, and use the PyPy interpreter.

-- 
Amaury Forgeot d'Arc