[Types-sig] RFC 0.1

Guido van Rossum guido@CNRI.Reston.VA.US
Tue, 14 Dec 1999 14:17:23 -0500


[Christian Tismer]
> Allow me a question about types:
> 
> Where are the limits between types, values, and properties of values?
> 
> Assume a function which returns either
> [1, 2, 3] or the number 42.
> 
> We now know that we either get a list or an integer.
> But in this case, we also know that we get a list of three
> integer elements which are known constants, or we get
> the integer 42 which is even, for instance.
> 
> So what is a 'type', how abstract or concrete should it be,
> and where do we draw the line?

Good questions.  I'd like to remember all of this information.  It can
help with optimization (through constant folding).  It can help detect
unreachable code (e.g. your example function always returns a true
value).  Etc., etc.

Note that this can all be folded into a sufficiently rich type system;
a type is nothing more than a (possibly infinite) set of values.
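
To make the "set of values" view concrete, here's a toy sketch (nothing
here is proposed syntax or API; all the names are made up): a type is
just a membership predicate, and the type of your example function is
the union of two such sets.

def is_list_of_ints(v):
    return isinstance(v, list) and all(isinstance(x, int) for x in v)

def union(*preds):
    return lambda v: any(p(v) for p in preds)

# The inferred result type of the example function: either the single
# value 42 or a list of integers.
result_type = union(lambda v: v == 42, is_list_of_ints)

def example(flag):
    if flag:
        return [1, 2, 3]
    return 42

for arg in (0, 1):
    r = example(arg)
    assert result_type(r)   # r is in the union of the two sets
    assert r                # every possible result is a true value

A sufficiently clever checker could compute exactly this and conclude,
for instance, that the last assert can never fail.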

> At the same time, Python is so rich in self-inspection that
> writing a dynamic type inference machine seems practicable,
> so how about not declaring types, but asking your code about its
> type?

I suppose you could do symbolic execution on the bytecode, but I don't
think this is a very fruitful path.  (Of course if anyone can prove
I'm wrong, it's you. :-)
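
Just to show what such an analysis would have to chew on -- purely an
illustration, assuming a Python whose dis module provides
dis.get_instructions():

import dis

def example(flag):
    if flag:
        return [1, 2, 3]
    return 42

# A symbolic executor would walk this instruction stream, propagating
# abstract values (sets of possible values) instead of concrete ones.
for ins in dis.get_instructions(example):
    print(ins.offset, ins.opname, ins.argrepr)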

> I could imagine two concepts working together:
> 
> Having optional interfaces, which is a different issue
> and looks fine (Jim's 0.1.1 implementation).
> 
> Having dynamic type inference, which is implemented by cached
> type info at runtime.

Eh?  Type inference is supposed to be a compile-time thing.  You
present your whole Python program to the typechecker and ask it "where
could this crash if I sent it on a rocket to Mars?"
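
A contrived example of the kind of question I mean (the function names
are made up):

def example(flag):
    if flag:
        return [1, 2, 3]
    return 42

def mission_step(flag):
    # A whole-program checker should flag this call before launch:
    # example() may return 42, and len(42) raises a TypeError.
    return len(example(flag))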

> (I hope this idea isn't too simple-minded.)
> Assume for instance the string module, implemented in Python.
> It would have an interface which defines what goes in and
> out of its functions.
> 
> At "compile" time of string.py, type inference can partially
> take place already when the code objects are created. The interface
> part creates restrictions on argument values, which can be used
> for further inference. It can also be deduced whether the return
> values already obey the interface or if deduction for imported
> functions is necessary.
> This info is saved in some cache with the compilation.
> Changes to the module object simply break the cache.

And that's exactly the problem.  I want to be able to be told whether
the cache might be broken *before* I launch my rocket to Mars.
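
For concreteness, here's a bare-bones sketch of the kind of run-time
type-info cache you describe (all names hypothetical, nothing here is
proposed):

class TypeInfoCache:
    def __init__(self):
        self.version = 0      # bumped whenever the module changes
        self.inferred = {}    # function name -> inferred signature

    def invalidate(self):
        # "Changes to the module object simply break the cache."
        self.version = self.version + 1
        self.inferred.clear()

_cache = TypeInfoCache()

def rebind(module_dict, name, value):
    # Rebinding a name the cached inferences depend on (e.g. _lower)
    # invalidates them -- but only at the moment the rebinding happens.
    module_dict[name] = value
    _cache.invalidate()

The trouble is that invalidate() only runs after the rebinding has
happened -- i.e. after launch.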

> When I dynamically redefine a piece of the module that it
> depends on (say I assign something new to "_lower"), then
> the analysis must be carried out again, recursively invalidating
> other cached info as necessary.

In my scenario, the assignment to _lower is either detected and taken
into account by the type checker, or forbidden.  But this decision is
taken at compile time, and if it is forbidden, it is flagged as a
compile-time error.  If you exec code that could make this assignment,
that would be a run-time error (it's also forbidden at run-time), but
typically the Mars lander isn't going to accept input for exec from
the Martians -- we could probably flag all uses of exec (and eval()
and a few others) as errors unless there's a try/except around them.
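
Purely as an illustration of that last idea (assuming the Python 3 ast
module, where exec is a function rather than a statement), such a
flagging pass might look like:

import ast

SOURCE = '''
try:
    exec(command_from_ground_control)
except Exception:
    pass

eval(martian_input)
'''

tree = ast.parse(SOURCE)

# Remember every node that sits somewhere inside a try statement.
guarded = set()
for node in ast.walk(tree):
    if isinstance(node, ast.Try):
        for inner in ast.walk(node):
            guarded.add(id(inner))

# Flag calls to exec()/eval() that are not guarded by a try/except.
for node in ast.walk(tree):
    if (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in ("exec", "eval")
            and id(node) not in guarded):
        print("unguarded %s() at line %d" % (node.func.id, node.lineno))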

> Well, this is an example where I think the restriction to
> type checking of expressions still applies, but something more is
> needed to trigger this check early.
> The namespace object involved is the string module's __dict__,
> which should know that it is referenced by this expression:
> 
> def lower(s):
> 	res = ''
> 	for c in s:
> 		res = res + _lower[ord(c)]
> 	return res
> 
> And on assignment to the name "_lower" in this case, it could
> invalidate code object lower's type cache. lower can no longer
> guarantee that it will return a string result and will trigger
> its interface object to re-check consistency. The latter
> will raise an interface_error if the rule doesn't match.
> 
> It remains an open question for me how deeply possible
> values should be checkable, e.g. "this arg has to be a list
> which is not empty". Minor point, maybe.
> 
> Did I make some sense, or am I off the track? - chris

Read my case study.

--Guido van Rossum (home page: http://www.python.org/~guido/)