[pypy-dev] Questions for Armin
Armin Rigo
arigo at tunes.org
Sat Jan 18 20:00:53 CET 2003
Hello Edward,
On Fri, Jan 17, 2003 at 01:08:28PM -0600, Edward K. Ream wrote:
> 1. How often and under what circumstances does psyco_compatible get called?
>
> My _guess_ is that it gets called once per every invocation of every
> "psycotic" function (function optimized by psyco). Is this correct?
No: psyco_compatible() is only called at compile-time.
When a "psycotic" function is called by regular Python code, we just jump to
machine code that starts at the beginning of the function with no particular
assumption about the arguments; it just receives PyObject* pointers. Only when
something more about a given argument is needed (say its type) will this
extra information be asked for. The corresponding machine code is very fast
in the common case: it loads the type, compares it with the most common type
found at this place, and if it matches, runs on. So in the common case, we
only have one type check per needed argument. Given
    def my_function(a, b, c):
        return a + b + c
the emitted machine code looks like what you would obtain by compiling this:
    PyObject* my_function(PyObject* a, PyObject* b, PyObject* c)
    {
        int r1, r2, r3;
        if (a->ob_type != &PyInt_Type) goto uncommon_case;
        if (b->ob_type != &PyInt_Type) goto uncommon_case;
        if (c->ob_type != &PyInt_Type) goto uncommon_case;
        r1 = ((PyIntObject*) a)->ob_ival;
        r2 = ((PyIntObject*) b)->ob_ival;
        r3 = ((PyIntObject*) c)->ob_ival;
        return PyInt_FromLong(r1+r2+r3);
    }
Only when a new, not-already-seen type appears does it follow the
"uncommon_case" branch. This triggers more compilation, i.e. emission of more
machine code. During this emission, we make numerous calls to
psyco_compatible() to see if we have reached a state that we have already
seen, and which subsequently corresponds to already-emitted machine code; if
it does, we emit a jump to this old code. This is the purpose of
psyco_compatible().
I must mention that in the above example, the nice-looking C version is only
arrived at after several steps of execution mixed with further
compilation. The first version is:
    PyObject* my_function(PyObject* a, PyObject* b, PyObject* c)
    {
        goto uncommon_case;   /* need a->ob_type */
    }
Then, when the function is first called with an integer in 'a', it becomes:
    PyObject* my_function(PyObject* a, PyObject* b, PyObject* c)
    {
        if (a->ob_type != &PyInt_Type) goto uncommon_case;
        goto uncommon_case;   /* need b->ob_type */
    }
and so on.
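As a rough illustration only (this is not Psyco's code), the "specialize on the types actually seen, fall back to more compilation otherwise" behaviour can be mimicked in plain Python. The class name SpecializingFunction and its attributes are hypothetical, and the toy checks all argument types at once, whereas the description above discovers them one at a time:

```python
class SpecializingFunction:
    """Toy model: one 'compiled' version per tuple of argument types."""

    def __init__(self, generic):
        self.generic = generic      # slow, fully general implementation
        self.versions = {}          # argument types -> specialized callable
        self.compilations = 0       # how often the "uncommon case" was hit

    def _compile(self, types):
        # Stands in for emitting new machine code for a newly seen case.
        self.compilations += 1
        if all(t is int for t in types):
            # Stands in for the unboxed integer fast path.
            return lambda a, b, c: a + b + c
        return self.generic

    def __call__(self, *args):
        types = tuple(type(x) for x in args)
        fast = self.versions.get(types)
        if fast is None:            # the "uncommon_case" branch
            fast = self.versions[types] = self._compile(types)
        return fast(*args)


my_function = SpecializingFunction(lambda a, b, c: a + b + c)
```

Calling my_function(1, 2, 3) twice triggers a single "compilation"; only a call with not-yet-seen types, such as three strings, triggers another one.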
> 2. True or false: the call to psyco_compatible would be equivalent to
> runtime code that discovers special values of certain particular variables.
See above.
> 3. True or false: adding more state information to psyco (in order to
> discover more runtime values) will slow down psyco_compatible.
This is true. The more run-time values you want to "discover" (I say
"promote to compile-time"), the more versions of the same code you will get,
and the slower psyco_compatible() will be (further slowing down compilation,
but not execution proper, as seen above).
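To make the cost argument concrete, here is a minimal sketch that assumes the already-compiled states are simply kept in a list and compared one by one. That representation, and the names find_compatible and compiled, are assumptions made for illustration; Psyco's real data structures may well differ.

```python
def find_compatible(state, compiled):
    """Look for an already-compiled state equal to `state`.

    `compiled` is a list of (state, code) pairs. Returns the matching
    code (or None) together with the number of comparisons performed.
    """
    comparisons = 0
    for known_state, code in compiled:
        comparisons += 1
        if known_state == state:
            return code, comparisons
    return None, comparisons


# Promoting a run-time value 'n' to compile-time for n = 1..5 yields
# five versions of the same code, one per value seen.
compiled = [((('n', k),), 'code_for_n_%d' % k) for k in range(1, 6)]
```

With these five versions, looking up the state for n = 5 takes five comparisons, and a state that was never compiled also costs five comparisons before the search fails. This cost is paid during compilation only, not when the emitted code runs.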
> 4. Are these the most important questions to ask about psyco? If not, what
> _are_ the key questions?
Hard to say! I would like to mention the "lazy" values ("virtual-time"). These are
the key to high-level optimizations in Psyco. In the above example you might
have noticed that the Python interpreter must build and free an intermediate
integer object for "a+b" when computing "a+b+c", while the C version I
showed does not. Psyco does this by considering the intermediate PyObject*
pointer as lazy. As long as it is not needed, no call to PyInt_FromLong() is
written; only the value "r1+r2" is computed. Similarly, in "a+b", if both
operands are strings, the result is a lazy string which is implemented as a
lazy list "[a,b]". Concatenating more strings turns the list into a real
Python list, but the resulting string itself is still lazy. This is how Psyco
ends up automatically translating things like
    s = ''
    for t in xxx:
        s += t
into something like
    lst = []
    for t in xxx:
        lst.append(t)
    s = ''.join(lst)
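The lazy-string idea can be imitated at the Python level with a small wrapper that records the pieces and joins them only when the value is really needed. This is a sketch under assumptions, not how Psyco represents lazy values internally; the class LazyStr and its force() method are hypothetical names.

```python
class LazyStr:
    """Toy lazy string: concatenation only records the pieces."""

    def __init__(self):
        self.parts = []     # pending pieces, not yet joined
        self.joins = 0      # how many real concatenations happened

    def __iadd__(self, other):
        # "s += t" just appends to the list; no new string is built.
        self.parts.append(other)
        return self

    def force(self):
        # The real string is built only when somebody needs it.
        self.joins += 1
        result = ''.join(self.parts)
        self.parts = [result]
        return result


s = LazyStr()
for t in ['spam', ' ', 'eggs']:
    s += t
text = s.force()
```

Here the loop performs three appends and a single join, instead of building an intermediate string at every iteration.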
I hope that these examples cast some light on Psyco. I realize that this
could distract people from the current goals of this project, and I apologize
for that. We should discuss e.g. "how restricted" the language we use for
Python-in-Python should be...
A bientot,
Armin.