[pypy-dev] Brainstorming at Python UK

Armin Rigo arigo at tunes.org
Wed Apr 28 14:11:21 CEST 2004


Hello Holger,

On Wed, Apr 28, 2004 at 11:45:08AM +0200, holger krekel wrote:
> Maybe it makes sense to talk about low-level information/representation
> objects rather than "abstract objects" which isn't a very expressive term
> IMO.

Yes, though they are not really representations yet.  "Representation" objects
would be more dependent on the code generator.  They are really "types" in the
theoretical sense: sets of values.  We could say "subset objects" and use the
s_ prefix for SomeXxx instances.

> Ok, i can see this working for variables containing "naturally"
> immutable objects like integers, strings and numbers. But how does the
> example apply to building a list in a loop?  I am a bit doubtful about a
> "virtually immutable" SomeList object unless you intend to use a low-level
> representation like e.g.: 
> 
>     class SomeGrowingList:
>      self.r_items = someitem    # a SomeXxx instance general enough to 
>                                 # hold all possible items of the list 
>      self.r_indexes = SomeInteger(0, sys.maxint-1) 

There are two different issues here.  One is how to make sure that the
analysis eventually terminates; a problem would be if a subset object is
replaced by a gradually more general one all the time, endlessly.  The other
is mutability.

For an example of the former: SomeInteger(0,1) -> SomeInteger(0,2) ->
SomeInteger(0,3) -> SomeInteger(0,4) -> ...  To prevent that, we could use
more simple subset objects that don't have this problem, e.g. SomeIntegers
that can only be either "known-to-be-positive" or not.  For lists we can just
ignore the index and assume that the list has an unknown length, so that
SomeList only has to deal with a 's_item' attribute general enough for all
items; if 's_item' cannot be more and more general ad infinitum, then SomeList
instances cannot either.  If we try to record too much about the length, we
could get SomeList(length=0) -> SomeList(length=0 or 1) -> SomeList(length=0
or 1 or 2) -> ...

The other issue is that of mutable objects (lists or instances).  The idea of
recording the creation points and re-analysing from there if necessary applies
to lists as well as instances:

  Block1:
    v1 = list()

This creates a list of "impossible" elements -- i.e. a list whose elements are
values in the empty set, which is the less general set.

  cp = CreationPoint(Block1, r_item=SomeImpossibleValue())
  v1 = SomeList(r_item=cp.r_item, creation_point=cp)

If later we figure out that the list could contain integers, e.g. because we
see an operation like:

    simple_call(v1.append, v2)    # v2 is SomeInteger()

Then we stop the analysis, and update the creation point object (which is not
immutable, unlike SomeXxx instances) to contain the more general value:

  v1.creation_point.r_item = union(v1.creation_point.r_item, v2)

And we restart the analysis from Block1:

  cp = find_existing_creation_point()
  v1 = SomeList(r_item=cp.r_item, creation_point=cp)

So the previous SomeList instance is never modified, but just dropped and
replaced by a new one.  The creation point itself is updated to include
information about how general the lists it creates should be.  It is much
cleaner to do it this way rather than patch the SomeList directly, because the
SomeList could already have been used at a lot of places, which would then
have to be manually invalidated because the SomeList became more general (this
is what we did with annotations).  The dependencies are clear: only the
CreationPoint object is modified, and it is only read during the creation
operation, so only the block containing this operation has to be invalidated.

Similarily, for object instances, the SomeInstance is never modified.  It
contains a reference to an InstanceCreationPoints, which records:

* all the blocks where an instance of a specific class is created;
* which attributes these instances should be given.

Only the InstanceCreationPoints is updated when an attribute must be added or
generalized; the already-existing SomeInstances are not altered or actively
deleted, but just forgotten.  The next generation of SomeInstances for the
same InstanceCreationPoints will replace them.

To do that, the mutable subset objects SomeList and SomeInstance could have a
"generation number": an object for the same creation point but with a more
recent generation number is more general (and thus replace during union) an
object from an older generation.  Using objects from outdated generations
could raise an exception that stops further analysis, thus waiting for an
object from the latest generation to reach that point and re-trigger the
analysis.


A bientôt,

Armin.



More information about the Pypy-dev mailing list