[pypy-dev] Brainstorming at Python UK
hpk at trillke.net
Wed Apr 28 11:45:08 CEST 2004
[Armin Rigo Tue, Apr 27, 2004 at 10:15:50PM +0100]
> Here are, as promised, a few words about the "brainstorming" at Python UK about
> the annotation stuff.
> "Annotations" were the means through which typing information about specific
> variables was stored. We'd like to replace them with explicit "abstract
> objects", which would be instances of simple classes like:
Maybe it makes sense to talk about low-level information/representation objects
rather than "abstract objects" which isn't a very expressive term IMO.
> class SomeObject:
>     "Nothing is known about this value."
> class SomeInteger:
>     def __init__(self, start=-sys.maxint-1, end=sys.maxint+1):
>         self.start = start
>         self.end = end
> class SomeList:
>     def __init__(self, anyitem):
>         self.anyitem = anyitem  # another SomeXxx instance
> So the "type inference" phase of the translation would take a control flow
> graph as input, as it currently does; then its goal is to associate one
> SomeXxx instance to each variable. This information can then be used by the
> code generator (Pyrex or Lisp).
Makes sense to me ...
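
To make the inference phase described above a bit more concrete, here is a minimal sketch in modern Python. It is not PyPy code: the operation tuples, `infer_add` and `annotate` are hypothetical names made up for illustration, `sys.maxsize` stands in for the Python 2 `sys.maxint`, and integer overflow is ignored.

```python
import sys

class SomeObject:
    """Nothing is known about this value."""

class SomeInteger(SomeObject):
    """An integer in range(start, end); 'end' is exclusive."""
    def __init__(self, start=-sys.maxsize - 1, end=sys.maxsize + 1):
        self.start = start
        self.end = end

def infer_add(a, b):
    # Hypothetical rule for 'add': the sum of two bounded integers is bounded.
    if isinstance(a, SomeInteger) and isinstance(b, SomeInteger):
        # Largest possible sum is (a.end - 1) + (b.end - 1), so the
        # exclusive upper bound is a.end + b.end - 1.
        return SomeInteger(a.start + b.start, a.end + b.end - 1)
    return SomeObject()

def annotate(operations, bindings):
    """Associate one SomeXxx instance with each variable of a block.

    'operations' is an assumed format: (result, opname, (arg1, arg2)).
    """
    bindings = dict(bindings)
    for result, opname, args in operations:
        if opname == "add":
            bindings[result] = infer_add(bindings[args[0]], bindings[args[1]])
        else:
            bindings[result] = SomeObject()  # unknown operation: know nothing
    return bindings

# The block "v4 = v1+v2; v5 = v4+v3" with assumed input ranges.
ops = [("v4", "add", ("v1", "v2")), ("v5", "add", ("v4", "v3"))]
inputs = {"v1": SomeInteger(0, 10), "v2": SomeInteger(0, 10),
          "v3": SomeInteger(0, 5)}
result = annotate(ops, inputs)
```

A code generator could then look up `result["v5"]` to decide which low-level integer type to emit.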
> A new idea inspired by Starkiller, and an important difference from the
> way annotations used to work, is that all the SomeXxx instances are always
> immutable. For example, SomeInteger(0,10) means, and will always mean, an integer
> in range(0,10). If later a more general value is found that could arrive into
> the same variable, then the variable gets associated with a new, more general
> instance like SomeInteger(-10,10). This change triggers a recomputation of
> the type inference, and the new SomeInteger(-10,10) will eventually propagate
> forward to wherever the variable is used.
Actually we have triggered re-computation in previous schemes, too. Keeping
the low level representation objects "virtually immutable" seems like a good
simplifying idea ... if it works ...
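
The "immutable instance, rebind the variable" idea could be sketched roughly as follows. This is only an illustration under assumptions: `union`, `Annotator`, `setbinding` and the `pending` list are hypothetical names, and a frozen dataclass is used here simply as one convenient way to get immutability.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SomeInteger:
    """Immutable: an integer in range(start, end)."""
    start: int
    end: int

def union(a, b):
    # Smallest SomeInteger that covers both a and b; builds a NEW instance.
    return SomeInteger(min(a.start, b.start), max(a.end, b.end))

class Annotator:
    def __init__(self):
        self.bindings = {}   # variable name -> SomeInteger
        self.pending = []    # variables whose users must be re-analysed

    def setbinding(self, var, value):
        old = self.bindings.get(var)
        new = value if old is None else union(old, value)
        if new != old:
            # The old instance is left untouched; the variable is simply
            # associated with the more general one, and a recomputation
            # of everything that uses 'var' is scheduled.
            self.bindings[var] = new
            self.pending.append(var)
        return new

ann = Annotator()
first = ann.setbinding("v1", SomeInteger(0, 10))
ann.setbinding("v1", SomeInteger(-10, 5))   # generalizes to SomeInteger(-10, 10)
```

After the second call, `first` still describes range(0,10), while `ann.bindings["v1"]` is the new, more general instance.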
> Here is an example:
> v4 = v1+v2
> v5 = v4+v3
> jump to Block2 with arg (v5)
Ok, i can see this working for variables containing "naturally"
immutable objects like integers, strings and numbers. But how does the
example apply to building a list in a loop? I am a bit doubtful about a
"virtually immutable" SomeList object unless you intend to use a low-level
representation like e.g.:
    self.r_items = someitem    # a SomeXxx instance general enough to
                               # hold all possible items of the list
    self.r_indexes = SomeInteger(0, sys.maxint-1)
Is something like this the underlying idea to allow "virtually
immutable" low level representation objects of
Note that marking low level representations with 'r_' or some such
might make sense, i guess. This is similar to what we do for
app-level representation objects with 'w_' to indicate they are
'wrapped' and only an object space knows how to operate on them.
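
One way the list-in-a-loop case might look with immutable representation objects is sketched below. The `r_items` and `r_indexes` names come from the fragment above; everything else (`union`, `list_append`, using `None` to mean "no item seen yet", `sys.maxsize` in place of `sys.maxint`) is an assumption made for the sake of a runnable example.

```python
import sys
from dataclasses import dataclass

@dataclass(frozen=True)
class SomeObject:
    """Nothing is known about this value."""

@dataclass(frozen=True)
class SomeInteger:
    start: int
    end: int

@dataclass(frozen=True)
class SomeList:
    r_items: object        # a SomeXxx general enough for all items, or None
    r_indexes: SomeInteger

def union(a, b):
    if isinstance(a, SomeInteger) and isinstance(b, SomeInteger):
        return SomeInteger(min(a.start, b.start), max(a.end, b.end))
    return SomeObject()    # fall back to "nothing is known"

def list_append(s_list, s_item):
    """Return the representation of the list after appending s_item.

    The given SomeList is never modified; a new one is returned only
    when the item representation has to become more general.
    """
    if s_list.r_items is None:
        items = s_item
    else:
        items = union(s_list.r_items, s_item)
    if items == s_list.r_items:
        return s_list      # already general enough, nothing to redo
    return SomeList(items, s_list.r_indexes)

empty = SomeList(None, SomeInteger(0, sys.maxsize - 1))
l1 = list_append(empty, SomeInteger(0, 1))
l2 = list_append(l1, SomeInteger(5, 6))    # r_items becomes SomeInteger(0, 6)
```

In a loop, the variable holding the list would be rebound to `l1`, then `l2`, and the analysis would settle once `list_append` keeps returning the same object.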
> Having immutable SomeXxx instances even for mutable objects is very useful for
> the object-oriented part of the analysis. Say that a SomeInstance would
> represent a Python instance, in the old-style way: a __class__ and some
> attributes. The SomeInstance is created at the point in the flow graph that
> creates the instance:
> v2 = simple_call(<class X>)
> This would register somewhere that the class X can be instantiated from this
> point in the flow graph. It would then create and store into v2 a
> SomeInstance of class X, which initially has no known attributes. If an
> attribute is later added to v2, then the SomeInstance detects it is not
> general enough. It kills the inference process; it records the existence of
> the new attribute in some data structure on the class X;
It should probably store it in the SomeInstance instance associated with
class X, right?
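
The kill-and-restart step from the quoted paragraph could be sketched like this. It is a toy model, not the actual design: `NeedsRestart`, `ClassInfo`, `analyse_init` and `run` are hypothetical names, and the analysed function is reduced to the list of attribute names it assigns on self.

```python
class NeedsRestart(Exception):
    """Raised when a SomeInstance turns out not to be general enough."""

class ClassInfo:
    """Assumed per-class data structure recording the known attributes."""
    def __init__(self, name):
        self.name = name
        self.attrs = set()

def analyse_init(classinfo, assigned_attrs):
    # Walk the attribute assignments in order. The first attribute not
    # yet known for the class is recorded and kills the current pass.
    for attr in assigned_attrs:
        if attr not in classinfo.attrs:
            classinfo.attrs.add(attr)
            raise NeedsRestart(attr)

def run(classinfo, assigned_attrs):
    """Re-run the analysis until it completes; return the restart count."""
    restarts = 0
    while True:
        try:
            analyse_init(classinfo, assigned_attrs)
            return restarts
        except NeedsRestart:
            restarts += 1

info = ClassInfo("X")
count = run(info, ["a", "b", "c"])   # one restart per newly found attribute
```

Because the discovered attributes are kept on the class-level structure, a later analysis of the same code runs through without any restart.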
> It seems that we will end up cancelling and restarting large parts of the
> analysis over and over again this way, but it may not be a big problem in
> practice: we can expect that a lot of information about the attributes will be
> known already after the __init__ call has completed. We may stop and re-analyse
> __init__ itself 7 times consecutively if it defines 7 attributes on self, each
> time progressing a bit further until the next new attribute kills us, but that
> shouldn't be a big problem because it is fairly local (but we'll have to try
> to analyse bigger programs to really know). This way is much cleaner
> than with the annotation stuff.
Yes, i agree that it seems so. But we have had schemes come and fail, so i am eager
to get an idea of how it works for the problem cases (lists, instances, ...)
we can identify. Also how to represent exceptions at a lower level and
then translate them to the target language is not clear yet.