[Types-sig] Static name checking
Paul Prescod
paul@prescod.net
Wed, 08 Dec 1999 09:41:30 -0500
Martijn Faassen wrote:
>
> > My assertion is that the first step is to statically check coherence
> > among Python's three (?) (function, module, builtin) runtime namespaces.
>
> What about classes referring to attributes of 'self', for instance,
> though? I'm still not entirely clear on what you're trying to
> accomplish, I'm afraid.
Let me see if I can do this with some invented notation. You'll have to
cut me some slack for typos and omitted (hopefully irrelevant) details
like about a dozen levels of the parse tree.
Let's pretend we are talking about Java. Let's pretend that we are
implementing a Java interpeter (including compiler) in the most
straight-forward (not efficient) way.
Consider the code:
class J{
String a;
void foo(){
a.whatever();
}
}
1. Parse it into tokens: (roughly)
(classdef "J"
(attributedef name: "a"
type: (name-ref "String" ))
(functiondef "foo"
(function-body
(method-call
object: (name-ref a)
method: "whatever"))))
That's very rough because you guys know about parse trees already.
2. Build "compile time objects" and replace variable references with
pointers to "compile time objects":
[class java.lang.String ....]
[class J
attributes: {"a": <reference-to java.lang.String>}
functions: {"foo": ...
(method-call
object: <reference-to mymod.J.a>
method: "whatever"))))
Step 2 is the step that Python doesn't have right now.
Note that at the end of this step, the references to names in "static"
namespaces have all been resolved but names in methods have NOT been
resolved. One obvious reason for this is that Java and Python both allow
forward references so maybe I don't even know what the methods of
Strings are yet.
3. Conceptually, once all of the type and variable objects are built,
THEN I can go through and check that the operations applied to types are
legal. ONE SUCH OPERATION is ".whatever". It becomes possible to check
that ".whatever" is legal at the same time that it becomes legal to
check whether "a+b" is legal.
4. Generate bytecode.
5. Run it.
Python has steps 1, 4 and 5 but skips steps 2 and 3. I am trying to get
us to the point where we can do step 2 so that we can get to step 3
eventually.
I once wrote a compiler and I beat my head against a wall until I
realized that foo.bar resolution is a massively different problem if foo
is a module (doesn't rely on type system) or a class (does rely on type
system).
The point of the "static" keyword is to allow Python author to say:
"Some of my modules are static like Java modules. Please resolve
references to these at compile time, not runtime."
> > Once we have name checking then we can design a syntax to statically
> > associate types with names. THEN we can do static type checking. I could
> > be wrong but it seems to me that once this is done, the definition of
> > swallow will be trivial: "A statically compilable Python module is a
> > file where every name is frozen and every name has a type declaration."
>
> And every imported module has the same properties, including this one.
> Though of course you may mean this with 'every name'. I don't think
> it'll be that trivial, as you haven't defined 'type declaration'; you
> run into complexities here,
> especially if you involve classes and objects.
That's true. That's why I'm not trying to solve that part of the problem
yet.
> Can't you go about it the other way around? First, you make a type
> declaration for
> all names in a module.
There are four declarations we could imagine. Each is a little bit
stronger than the previous.
1. "I believe that every name in this module/class that is not an
attribute name can be statically resolved."
2. "I believe that this module can be used in other modules where every
non-attribute name is supposed to be statically resolved."
3. "I believe that every name in this module/class can be statically
type checked (including attribute name checking)".
4. "I believe that this module/class can be used in other modules where
every name can be statically type checked."
"freeze" could be 1. There is probably not much virtue in separating 1
and 2 so we could rather say that "freeze" actually means 2 which
implies 1. A new, "type-safe" keyword might be used for 4 which again
would imply 3 (and 2, and 1).
If this is to be optional, off-by-default type and name checking then we
need a way to turn it ON.
"freeze" might be enough to allow some type inferencing and early
binding (for performance, not safety) .
"type-safe" would be used for performance and safety at a price that it
would require you to stick to the "Java subset".
> Then you check (somehow) if there isn't code that
> contradicts this type definition; that is, there should be no
> assignments of one name to another that violates the static type
> definitions, no attribute accesses to undefined attributes, and so on.
> Of course you instantly produce errors if any name doesn't have a type
> definition.
Here's the rub. In my mind, this should be legal code:
def doubleString( String b ):
return b*2
def doit():
doubleString( eval( raw_input() ) )
doit()
That's what Python users will expect and that's also what VB does. This
should be checked at *runtime*. On the other hand, THIS would cause a
static type error:
type-safe
def doit():
doubleString( eval( raw_input() ) )
That's illegal because it claims to be type-safe but isn't really. The
same goes for static:
static
def foo():
return this.that.b()
is only valid if this.that is statically resolvable. (e.g. a module, not
a class, and a module that is itself static)
> Another
> prerequisite for a type checker is the determination of the Swallow
> subset. For instance one can imagine that in Swallow it's illegal to
> import modules except on the top. I imagine these limitations will
> become more obvious after a type system has been developed.
I think we disagree on the granularity of the project. It should be
possible to declare individual functions, classes or methods statically
type checkable, not just whole modules.
> A run-time layer can check whether each python object that is sent
> into Swallow conforms to the type definitions; a function expecting an
> int must indeed be sent an int object. If not, somekind of exception
> should be raised.
I agree. But I think we also need a way to say: "I want you to check
this code at compile time, not runtime."
--
Paul Prescod - ISOGEN Consulting Engineer speaking for himself
Floggings will increase until morale improves.