[pypy-dev] Re: [ann] Minimal Python project

Christian Tismer tismer at tismer.com
Thu Jan 16 13:26:33 CET 2003

Paul Rubin wrote:
> Christian Tismer <tismer at tismer.com> writes:
>>This leads to high-level descriptions of low-level
>>fields of all structures.

> Do you have some other examples that really get used often?  I'd think
> f_lineno's don't get used that often.

This was just an example of a common field (in
fact it is used quite often at runtime, but almost
only by the SET_LINENO opcode) that is embedded
in some other structure.
Any other field would do as well.
The idea is to describe fixed data structures
in a way that allows Python to easily deduce
what the data type is, without trial and error.
See for example Thomas Heller's ctypes module,
which has such descriptions for everything.
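
As a rough illustration of such a description, here is a minimal
sketch using the ctypes module as it ships in today's standard
library. The IntObject layout is a simplified assumption made up for
this example, not the real PyIntObject definition.

```python
import ctypes

# Hypothetical, simplified layout; the real PyIntObject header differs.
class IntObject(ctypes.Structure):
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
        ("ob_ival", ctypes.c_long),
    ]

# The description can be queried without creating any instance.
field_types = dict(IntObject._fields_)
ival_type = field_types["ob_ival"]        # the declared type, ctypes.c_long
ival_offset = IntObject.ob_ival.offset    # byte offset inside the structure
ival_size = IntObject.ob_ival.size        # size of the field in bytes
```

The type, offset and size of ob_ival are all available from the
class alone, which is the kind of information a code generator
would need.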

A drawback of this is that we always need to
access target variables in dotted notation.
Otherwise we lose the static type info.
If Python had type declarations, this would
be no problem.

Here is an example from intobject.c:

static int
int_compare(PyIntObject *v, PyIntObject *w)
{
	register long i = v->ob_ival;
	register long j = w->ob_ival;
	return (i < j) ? -1 : (i > j) ? 1 : 0;
}

Assuming that we try to model this in Python,
the resulting code might look like this:

def int_compare(v, w):
      i, j = v.ob_ival, w.ob_ival
      if i < j: return -1
      elif i > j: return 1
      return 0

The above code only occurs in contexts where
integer objects are passed in. There is no
type info in advance, but at the first execution
of this code, v and w are passed in with the
descriptions of all their fields, and it is now
absolutely clear that i and j are values of
fixed-size integers.
Code for directly accessing the ob_ival fields
and doing the comparison can be emitted
immediately, the first time the code runs.
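
A minimal sketch of this "specialize on first execution" idea
follows. Everything in it is an assumption for illustration: the
IntObj class with its fields dictionary, the emitted cache, and
emit_compare, which returns a Python closure where a real
implementation would emit low-level code.

```python
# Hypothetical stand-in for an object that carries field descriptions.
class IntObj:
    fields = {"ob_ival": "long"}   # assumed description format

    def __init__(self, ob_ival):
        self.ob_ival = ob_ival

emitted = {}   # cache: tuple of field types -> specialized function

def emit_compare(types):
    # Stand-in for a code generator; the closure plays the role
    # of the emitted code.
    if types != ("long", "long"):
        raise TypeError("no specialization for %r" % (types,))

    def compare_long(v, w):
        i, j = v.ob_ival, w.ob_ival
        return -1 if i < j else (1 if i > j else 0)

    return compare_long

def int_compare(v, w):
    # Read the field descriptions that arrive with the arguments.
    types = (type(v).fields["ob_ival"], type(w).fields["ob_ival"])
    impl = emitted.get(types)
    if impl is None:               # first execution with these types
        impl = emitted[types] = emit_compare(types)
    return impl(v, w)

results = [
    int_compare(IntObj(1), IntObj(2)),
    int_compare(IntObj(5), IntObj(5)),
    int_compare(IntObj(9), IntObj(3)),
]
```

Only the first call pays for the emission step; later calls with
the same field types reuse the cached function.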

A remaining problem with the lack of declarations
is local variables that are not members of
a structure, so that it is not clear from the
beginning what their primitive type should be.
One ugly way would be to construct a structure
for the local variables and to use dotted notation
again. I hope this can be avoided by propagation
of type info via __coerce__.
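
To make the "ugly way" concrete, here is a small sketch that keeps
the locals of int_compare in a described structure. It uses ctypes
as shipped in today's standard library; CompareLocals and
DescribedInt are hypothetical names invented for this example.

```python
import ctypes

# Hypothetical structure describing the locals of int_compare.
class CompareLocals(ctypes.Structure):
    _fields_ = [("i", ctypes.c_long), ("j", ctypes.c_long)]

# Hypothetical stand-in for an integer object with a described field.
class DescribedInt(ctypes.Structure):
    _fields_ = [("ob_ival", ctypes.c_long)]

def int_compare(v, w):
    loc = CompareLocals()
    # Every local is reached through dotted notation, so its
    # declared type is always known from CompareLocals._fields_.
    loc.i, loc.j = v.ob_ival, w.ob_ival
    if loc.i < loc.j:
        return -1
    elif loc.i > loc.j:
        return 1
    return 0

local_types = dict(CompareLocals._fields_)
```

The types of i and j are now explicit, at the price of writing
loc.i and loc.j everywhere.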

Another snippet from intobject.c:

		for (i = 0, p = &list->objects[0];
		     i < N_INTOBJECTS;
		     i++, p++) {
			if (PyInt_CheckExact(p) && p->ob_refcnt != 0)

Given a type object named c_integer, this might translate to:

      i = c_integer(0)
      p = list.objects
      while i < N_INTOBJECTS:
          # body not implemented here
          i += 1
          p += 1

Here I use a known class to initialize i.
The data type is therefore known.
p, as a pointer field in the list structure, is
also known.
The __coerce__ method of these classes can be
written so that it always propagates its own
class to the other operand; in this case in
particular, the right operand is a constant.
A definition might look like this:

class c_integer(c_datatypes):
      def __coerce__(self, other):
          if type(other) == int:
              return self, c_integer(other)
          elif ....

What I tried to express is that with little or
no help from the programmer, primitive data types
can be deduced quite easily, and unique code
can be emitted on first execution of the code.
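
The propagation idea can be tried out in current Python, with one
caveat: Python 3 no longer calls __coerce__, so the sketch below
gets the same effect through __add__, __radd__ and __lt__. The
c_integer class here is a hypothetical stand-in written for this
example; it is not the ctypes type, and N_INTOBJECTS is given an
arbitrary small value.

```python
class c_integer:
    """Hypothetical fixed-size integer wrapper (illustration only)."""

    def __init__(self, value=0):
        self.value = int(value)

    @classmethod
    def _lift(cls, other):
        # Propagate our own class to a plain int operand.
        if isinstance(other, cls):
            return other
        if isinstance(other, int):
            return cls(other)
        return None

    def __add__(self, other):
        rhs = self._lift(other)
        if rhs is None:
            return NotImplemented
        return type(self)(self.value + rhs.value)

    __radd__ = __add__   # addition is commutative, so reuse __add__

    def __lt__(self, other):
        rhs = self._lift(other)
        if rhs is None:
            return NotImplemented
        return self.value < rhs.value

N_INTOBJECTS = 3         # arbitrary value for the example
i = c_integer(0)
steps = 0
while i < N_INTOBJECTS:  # the constant is lifted to c_integer
    i += 1               # falls back to __add__, result stays c_integer
    steps += 1

mixed = 1 + i            # int on the left: __radd__ takes over
```

After the loop, i is still a c_integer, so the type given at
initialization survives every operation with a plain constant.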

> Will you give some thought to going to a tagged representation of
> object references, and maybe a more traditional GC?  That way you
> avoid a lot of memory traffic, and also don't have to adjust reference
> counts every time you touch anything.
> I think these would give a big performance boost, and also would also
> simplify the C API.  It might be possible to supply a compatibility
> layer to make the old C API still useable.

Can you give me a hint how this would be done?
I have no experience with tagged representations,
but if this can save so much, I should learn more
about it.

cheers - chris

Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
       whom do you want to sponsor today?   http://www.stackless.com/
