Mailman 3 Ideas and Opinions - pypy-dev

June 19, 2003

At the moment my PyPy understanding is at novice level,
and all my comments stay a bit on the surface, like "Hey Prof! You forgot a closing parenthesis".

So I will just browse through some ideas and opinions:

1) PyPy is not only about Python. It's about compiling and interpreting, that is, translating and executing code.
These two approaches are necessary for every programming language. Translating, because a programming
language is designed for humans. Executing, because the behaviour of a program is not decidable.
So even compiled languages are eventually executed on a real machine
(maybe with some intermediate translation/execution on virtual machines).

At compile time we try to collect all the information that is needed in the target environment/execution context.
For example, type information in statically typed languages, because the target language (assembler) makes a difference
between adding ints and adding floats (as I remember from Armin's lecture last year).

In Python programs there is not enough type information at compile time, so CPython's target is not assembler on a CPU, but
Python bytecode on a Python virtual machine. CPython interprets this bytecode at runtime with some opcode machinery.
PyPy does an abstract interpretation in several ObjectSpaces.
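
To make the point about missing type information concrete, here is a small sketch using the standard `dis` module. The function name `add` is only an illustration. The compiled bytecode contains one generic binary-operation instruction, and whether it adds ints or floats is decided only when it runs. The exact opcode name depends on the CPython version.

```python
import dis

def add(a, b):
    # Nothing here tells the compiler whether a and b are ints or floats.
    return a + b

# Collect the opcode names of the compiled function.
opnames = [instr.opname for instr in dis.get_instructions(add)]

# One generic instruction covers every operand type; its name differs
# between CPython versions (for example BINARY_ADD or BINARY_OP).
binary_ops = [name for name in opnames if name.startswith("BINARY_")]

# The same bytecode is executed for both calls; the operand types are
# only examined when the instruction runs.
int_result = add(1, 2)
float_result = add(1.5, 2.0)
```

Printing `opnames` on your own interpreter shows the full instruction list for that version.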

Usually execution and translation work on different data structures at different times.
Translation works at compile time on the static data structure of an abstract syntax tree (given by the parser).
Execution works at runtime on the dynamic sequence of instructions (given by the interpretation of bytecode),
which is the input for abstract interpretation.
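
As a toy illustration of how one instruction sequence can feed both normal execution and abstract interpretation, the sketch below runs the same small program through two interchangeable "spaces". All class and function names here (`ConcreteSpace`, `TypeSpace`, `interpret`) are hypothetical and this is not PyPy's real ObjectSpace interface; it only shows the idea that the interpreter loop stays the same while the meaning of the operations changes.

```python
class ConcreteSpace:
    """Computes with real values."""
    def wrap(self, value):
        return value

    def add(self, a, b):
        return a + b


class TypeSpace:
    """Computes with types instead of values (abstract interpretation)."""
    def wrap(self, value):
        return type(value)

    def add(self, a, b):
        # int + int stays int; a float operand makes the result float.
        if a is int and b is int:
            return int
        if {a, b} <= {int, float}:
            return float
        raise TypeError("unsupported operand types in this toy space")


def interpret(instructions, space):
    """Run a flat instruction sequence on a stack, delegating to `space`."""
    stack = []
    for opname, arg in instructions:
        if opname == "PUSH":
            stack.append(space.wrap(arg))
        elif opname == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(space.add(left, right))
        else:
            raise ValueError("unknown instruction: " + opname)
    return stack.pop()


# The instruction sequence for (1 + 2) + 0.5.
program = [("PUSH", 1), ("PUSH", 2), ("ADD", None),
           ("PUSH", 0.5), ("ADD", None)]

concrete_result = interpret(program, ConcreteSpace())  # a value
abstract_result = interpret(program, TypeSpace())      # a type
```

The first run yields the number the program computes, the second yields only the type of that number, without doing the arithmetic.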

If we want to do some magic translation/execution entanglement, we
have to think about doing the right thing at the right time (there are several times of execution and translation)
and about mappings between our dynamic and static data structures.

For example, while interpreting code we enter a loop which we want to optimize.
We can't do this earlier without incurring a big memory overhead, because earlier we haven't
got the necessary type information.

So we follow the mapping from our instruction to the abstract syntax tree, set type labels at some nodes and
do some local translation and optimization. As a result we get (locally?) changed bytecode for the body of our loop. We give it
to the interpreter and execute it.
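
A heavily simplified sketch of this idea follows. It does not touch syntax trees or bytecode at all: a plain function `int_add` stands in for the locally translated loop body, and the names `HOT_THRESHOLD`, `generic_add` and `run_loop` are made up for the illustration. What it does show is the timing: types are observed while the loop runs, the specialized body is swapped in only once the loop is hot, and a guard falls back to the generic body if the assumption breaks.

```python
HOT_THRESHOLD = 3  # hypothetical: iterations before a loop counts as hot


def generic_add(a, b):
    # Slow path: inspect the types on every call, as an interpreter must.
    if isinstance(a, int) and isinstance(b, int):
        return a + b
    if isinstance(a, float) or isinstance(b, float):
        return float(a) + float(b)
    raise TypeError("unsupported operands")


def int_add(a, b):
    # Fast path: stands in for locally translated code that assumes ints.
    return a + b


def run_loop(values):
    """Sum `values`, switching to the specialized body once the loop is hot."""
    total = 0
    add = generic_add
    seen_types = set()
    specialized_at = None
    for count, value in enumerate(values):
        seen_types.add(type(value))
        if seen_types != {int}:
            # Guard: a non-int showed up, so use the generic body.
            add = generic_add
        elif add is generic_add and count >= HOT_THRESHOLD:
            # Only ints so far and the loop is hot: swap in the fast body.
            add = int_add
            specialized_at = count
        total = add(total, value)
    return total, specialized_at


all_ints = run_loop([1, 2, 3, 4, 5, 6])
mixed = run_loop([1, 2.5, 3, 4, 5])
```

In the first call the loop is specialized at iteration 3; in the second the float seen early on prevents specialization, so only the path that is actually taken is ever optimized.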

This hot spot approach gives us the freedom not to calculate all execution paths of a program,
but only those that are actually needed. Another approach to filter out only the important paths could work in combination with a
test framework. Execute all your tests and cache the dynamically generated code in a database. If your
program is still slow, maybe you forgot some tests.
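
The test-framework idea could look roughly like the sketch below. It uses an in-memory SQLite database from the standard library as the cache; the table layout, the helper names (`record_call`, `lookup`, `generate_source`) and the text stored as "generated code" are all assumptions made for the illustration, not an existing PyPy facility.

```python
import sqlite3

# Hypothetical schema: one row per (function, argument types) combination
# seen while the tests ran, together with the code generated for it.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE specializations ("
    "func TEXT, arg_types TEXT, source TEXT, "
    "PRIMARY KEY (func, arg_types))"
)


def type_key(args):
    # Describe a call by the type names of its arguments.
    return ",".join(type(a).__name__ for a in args)


def generate_source(func_name, key):
    # Stand-in for real code generation: just emit a typed stub as text.
    params = ", ".join(
        "%s arg%d" % (t, i) for i, t in enumerate(key.split(","))
    )
    return "specialized %s(%s)" % (func_name, params)


def lookup(func_name, *args):
    """Return the cached code for these argument types, or None."""
    row = db.execute(
        "SELECT source FROM specializations WHERE func = ? AND arg_types = ?",
        (func_name, type_key(args)),
    ).fetchone()
    return row[0] if row else None


def record_call(func_name, *args):
    """Called during a test run: cache code for the observed types."""
    if lookup(func_name, *args) is None:
        key = type_key(args)
        db.execute(
            "INSERT INTO specializations VALUES (?, ?, ?)",
            (func_name, key, generate_source(func_name, key)),
        )


# Pretend the test suite exercised these calls.
record_call("primes", 10)
record_call("primes", 10)      # same types: no new entry
record_call("scale", 2, 1.5)

cached = lookup("primes", 100)   # covered by the tests
missing = lookup("scale", 2, 3)  # never tested with (int, int)
row_count = db.execute("SELECT COUNT(*) FROM specializations").fetchone()[0]
```

A cache miss such as `missing` is exactly the "you forgot some tests" case: that combination of types was never exercised, so no optimized code exists for it.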

2) At first glance, Pyrex seems to me a beautiful example of a concrete syntax which does not get in the way of
the programmer (of course it's more than that). Pyrex code can be annotated with C-style static information,
but it is not necessary to do this. You can program without annotations. Then it's just Python. If you
use annotations, Pyrex generates the C source for extension types.

------example---------
def primes(int kmax):
  cdef int n, k, i
  cdef int p[1000]
  result = []
  if kmax > 1000:
    kmax = 1000
  k = 0
  n = 2
  while k < kmax:
    i = 0
    while i < k and n % p[i] <> 0:
      i = i + 1
    if i == k:
      p[k] = n
      k = k + 1
      result.append(n)
    n = n + 1
  return result
---------------------
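
For comparison, here is a sketch of the same function with the C-style declarations removed, which makes it plain Python. Replacing the C array `p` with a Python list and writing `!=` instead of `<>` are my substitutions so that it runs on a current interpreter; the algorithm is unchanged.

```python
def primes(kmax):
    # Same algorithm as the Pyrex example, without cdef declarations.
    p = [0] * 1000          # replaces the C array `int p[1000]`
    result = []
    if kmax > 1000:
        kmax = 1000
    k = 0
    n = 2
    while k < kmax:
        i = 0
        # Try each prime found so far as a divisor of n.
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:
            # No earlier prime divides n, so n is prime.
            p[k] = n
            k = k + 1
            result.append(n)
        n = n + 1
    return result


first_primes = primes(10)
```

Both versions compute the same list; the annotated one only gives the translator enough static information to use C ints and a C array.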

If we add some concrete syntax to PyPy, we will not only optimize the language core.
We will give the application programmer the opportunity to choose her code style on a scale from dynamic to static.

For example, all modules of the Python Standard Library should be written as statically as possible.

Günter

P.S.: comments and clarifications of misunderstandings are welcome

Günter Jantzen