[pypy-dev] Re: [ann] Minimal Python project

Christian Tismer tismer at tismer.com
Thu Jan 16 18:14:23 CET 2003


Samuele Pedroni wrote:
> From: "Christian Tismer" <tismer at tismer.com>
[tagged object references]

>>Can you give me a hint how this would be done?
>>I have no experience with tagged representations,
>>but if this can save so much, I should learn more
>>about it.
> 
> this is a relevant survey paper:
> 
> http://citeseer.nj.nec.com/gudeman93representing.html
> 
> Representing Type Information in Dynamically Typed Languages

Oh, well, sorry, now I remember.
Sure, I also remember that I was thinking
loud about boxing variables and tagging addresses
a few years ago. Can't find it any longer, but
as I remember, somebody was not convinced that
tagging would be great for Python. Maybe TimBot?

I know a couple fo interpreters which use tagging,
and Guile for instance makes extensive use of it.
This technique is kind of data compression by
putting type info into an unused pattern of addresses.
It can avoid lots of allocations, but also turns
every object access into some macro processing.
for a language that I have to code by hand, I don't
like that so much.
But when we have control over everything, like in
this project, it makes at least sense to try an
alternative object model, by the newly gathered
flexibility.

For the bootstrap phase, I'd say hands off from
any changes of the model. First we redefine
the world identically "in place". When that works,
we can try changing worlds.

> integer in the range [-2**30,2**30-1] would be represented by their bit
> patterns shifted left by 1 bit
> 
> all other objects would be boxed, and the address left shifted by 1 and ORed
> with 1.

Humm. I would use the left shifted integer, or'ed with 1.
After testing that bit, Arithmetic right shift gives
the integer.
That allows to use the address variant without any
operation and has the nice debugging facility that
addresses are already addresses, and object inspection
works directly.

...

> Another approach is to carry data around as a struct where both the tag and the
> datum are machine words, this approach is AFAIK currently used in the Gwydion
> Dylan compiler (www.gwydiondylan.org) originally developed at CMU.

This makes it impossible to get an object as a result
value of a function. You need references all the time.
Also, container objects will double their size.
I think, both can be tried, later, by changing only
a little Python code that is responsible for the
representation of this.
Getting completely rid of reference counting in favor
of a classical GC is also a challenging path which
cannot be tried with CPython at all.

The best layout will win the game. I don't know in
advance, so let's create the thing flexible enough
to keep all paths open.

cheers - chris

p.s.: Paul, would you mind to participate in pypy-dev,
then we can avoid crossposting.

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/



More information about the Pypy-dev mailing list