[pypy-dev] Annotating space status

Sun Jul 6 13:54:50 CEST 2003

Hello Samuele,

On Sat, Jul 05, 2003 at 03:36:41AM +0200, Samuele Pedroni wrote:
>        r = self.codetest("def f(a):\n"
>                           "    x = [1,2]\n"
>                           "    if a:\n"
>                           "        x.append(3)\n"
>                           "    else:\n"
>                           "        x.append(3)\n"
>                           "    return x",
>                          'f',[W_Anything()])
>         print r
> 
> this prints W_Constant([1, 2, 3, 3])

Aaargh. This is getting messy. Again, we really need to clarify what we want.

History: Originally, I was expecting the AnnotationObjectSpace to work
exclusively on RPython-compliant code. In this case it seems that we can
entirely avoid the problem of mutable objects. For example, the above code
(after translation) would mean that we malloc() an array of two ints, then
realloc() it to make room for a third one. The actual values in the array
would never be part of a W_Xxx() wrapper. In other words W_Constant() would
only be used for constant immutable objects.
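
To make the difference concrete, here is a minimal sketch of what presumably
goes wrong in the quoted test, next to the RPython-only view. The classes are
simplified stand-ins, not the real AnnotationObjectSpace wrappers; W_List in
particular is a hypothetical name for "a list whose items are never stored in
the wrapper".

```python
class W_Constant:
    """Stand-in wrapper holding a concrete value (assumption, simplified)."""
    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return "W_Constant(%r)" % (self.value,)


class W_Integer:
    """Stand-in for 'some integer, value unknown'."""
    def __repr__(self):
        return "W_Integer()"


class W_List:
    """Hypothetical: records only the item annotation, never the items."""
    def __init__(self, w_item):
        self.w_item = w_item

    def append(self, w_value):
        # Appending changes the run-time length, not the annotation.
        return self

    def __repr__(self):
        return "W_List(%r)" % (self.w_item,)


def analyze_if_else(w_x, w_three):
    # The condition is unknown, so the analysis visits both branches,
    # and each branch performs x.append(3) on the same wrapped object.
    for _branch in ("if", "else"):
        if isinstance(w_x, W_Constant):
            w_x.value.append(w_three.value)  # mutates the shared real list
        else:
            w_x.append(w_three)
    return w_x


leaky = analyze_if_else(W_Constant([1, 2]), W_Constant(3))
safe = analyze_if_else(W_List(W_Integer()), W_Integer())
print(leaky)  # W_Constant([1, 2, 3, 3])
print(safe)   # W_List(W_Integer())
```

The first call shows the mutable list leaking through W_Constant(); the second
never looks at the actual values, so visiting both branches is harmless.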

It is also the reason why I felt W_Anything() to be unnecessary: everything
*should* be known (this might require using something like
W_Union(W_Integer(), W_String()) in places). For this specific goal there is
no need to target Pyrex or any particularly clever run-time environment
because there is never anything more than ints and strings and structs being
manipulated -- no W_Anything().
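
As a rough illustration of how W_Union() could stand in for W_Anything() where
two code paths meet, here is a small sketch. The merge() helper and the W_Kind
base class are hypothetical names introduced only for this example; the real
wrappers may look quite different.

```python
class W_Kind:
    """Hypothetical base class: two wrappers of the same class are equal."""
    def __eq__(self, other):
        return type(self) is type(other)

    def __hash__(self):
        return hash(type(self))

    def __repr__(self):
        return "%s()" % type(self).__name__


class W_Integer(W_Kind):
    pass


class W_String(W_Kind):
    pass


class W_Union:
    """Stand-in for 'one of these known kinds'."""
    def __init__(self, *options):
        self.options = frozenset(options)

    def __eq__(self, other):
        return isinstance(other, W_Union) and self.options == other.options

    def __hash__(self):
        return hash(self.options)


def merge(w_a, w_b):
    # Collect every alternative from both sides, flattening nested unions.
    options = set()
    for w in (w_a, w_b):
        if isinstance(w, W_Union):
            options |= w.options
        else:
            options.add(w)
    # A single alternative needs no union wrapper.
    if len(options) == 1:
        return next(iter(options))
    return W_Union(*options)


same = merge(W_Integer(), W_Integer())
mixed = merge(W_Integer(), W_String())
```

Here same is a plain W_Integer() and mixed is a W_Union() of the two kinds, so
the result stays fully known without any catch-all fall-back.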

Now it seems that we shifted towards the more general goal of analyzing *any*
Python code, reverting to W_Anything() if necessary. This is a cool goal too
but we should clarify which one we are heading for.

For reference, the shift was caused by the app-helpers used by the interpreter.
These were not meant to be written in any particularly restricted style, but
still, working our way through them is necessary -- for example, we need to
call decode_code_arguments() to follow where the arguments will end up, and we
must do this in some annotating object space because the arguments are
typically W_Integer() or W_String(), as opposed to real objects.

Proposal: Maybe we can decide we don't know yet exactly what we want, and just
special-case the functions in interpreter/*_app.py that we need for the
analysis of simple (RPython) programs. For example, we can special-case
decode_code_arguments() by saying that whenever it is called (i.e. whenever
the AnnotationObjectSpace analyzes a non-trivial call) then:

 * we collect the W_Xxx() arguments in real tuples and dicts
 * we get the real code object
 * we just call decode_code_arguments(), which will manipulate these W_Xxx()
   objects instead of real objects, but it doesn't matter to it

If not enough information is known to prepare the arguments like that (for 
example, because we don't know the length of the argument tuple) then it is an 
error anyway: it is something we don't allow in RPython.
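
The steps above could look roughly like the following sketch. It is only an
illustration under assumptions: decode_arguments_sketch() is a simplified,
hypothetical stand-in for decode_code_arguments() (plain positional and keyword
parameters only, no defaults, no * or **), and annotate_call() and
UnknownArgumentShape are invented names.

```python
class W_Integer:
    def __repr__(self):
        return "W_Integer()"


class W_String:
    def __repr__(self):
        return "W_String()"


class UnknownArgumentShape(Exception):
    """Hypothetical error: the call cannot be analyzed as RPython."""


def decode_arguments_sketch(args, kwargs, code):
    # Simplified stand-in for decode_code_arguments(): it only moves the
    # argument objects around, so it does not care that they are W_Xxx().
    names = code.co_varnames[:code.co_argcount]
    if len(args) > len(names):
        raise TypeError("too many positional arguments")
    bound = dict(zip(names, args))
    for name, w_value in kwargs.items():
        if name not in names:
            raise TypeError("unexpected keyword argument %r" % name)
        if name in bound:
            raise TypeError("multiple values for argument %r" % name)
        bound[name] = w_value
    missing = [name for name in names if name not in bound]
    if missing:
        raise TypeError("missing arguments: %r" % missing)
    return bound


def annotate_call(func, w_args, w_kwargs):
    # The W_Xxx() arguments must already sit in a real tuple and a real dict;
    # otherwise the number of arguments is unknown, which RPython forbids.
    if not isinstance(w_args, tuple) or not isinstance(w_kwargs, dict):
        raise UnknownArgumentShape("argument tuple or dict is not known")
    return decode_arguments_sketch(w_args, w_kwargs, func.__code__)


def target(count, label):
    return label * count


bound = annotate_call(target, (W_Integer(),), {"label": W_String()})
```

After this call, bound maps "count" to the W_Integer() and "label" to the
W_String(), which is where the analysis of the callee would start from.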

Drawback: We can only process RPython programs. Well, that was the original
goal anyway. I think it is cleaner, but it also means that it will take more
time before we can actually process the whole of PyPy. The other (current)
solution is to have a very general but hacky W_Anything() fall-back
that could quickly be complete enough to process arbitrary programs (provided,
however, that we don't keep running into these mutable object problems, which
is not clear to me).

Both goals are interesting per se, but my opinion is that we should
concentrate on the first one right now.

A bientôt,

Armin.