[Types-sig] recursive types, type safety, and flow analysis

scott scott@chronis.pobox.com
Wed, 22 Dec 1999 06:47:24 -0500

On Wed, Dec 22, 1999 at 01:45:43AM -0800, Greg Stein wrote:
> On Wed, 22 Dec 1999, scott wrote:
> I posted a set of 4 cases a few messages ago. Without control flow
> analysis, the type checker cannot determine which of the four cases is
> being used when it analyzes f(). 

for easy reference, they're at:

2 of those 4 cases use 'del' in the global space. I'm inclined to
believe that those 2 cases are beyond the scope of static type

I've even recently written code that does just that, creating a
module-as-object interface, where the deleted variables are parallell
to private variables in a class -- basically inaccessible. I would not
expect static typing mechanism to grok that module, and would be fine
with simply having it ignore the possibility of undefined names that
result from 'del' or other runtime behavior. It certainly seems like
an exceptional case that has undue complications in a static type
system.  Do you see any relatively easy way to handle it more
accurately ?

The other 2 cases seem handleable at compile time under a system which
builds typedecl information by reference (*) -- iteratively or 2-pass,
whichever works out best, sans flow-control analysis, and at compile
time, with a separate typedecl-name namespace mechanism.  Any
information distinguishing case 1 from case 2 (like redefining g() 
inside f) would either fall under local namespace type-ing  or something 
which should, IMO, not be handled by a static type system.

> The compiler *will* be able to generally verify types. It just can't
> handle a determine which of a set of alternatives an object will have at a
> specific point in type (assuming that object occurs in a different body of
> code than that which is being analyzed).
> Am I being clear enough? It seems like I've said this about three times so
> far...

Yes, I got it, or so I think :) But I think we may have 2 different
expectations of something fairly basic:

decl f(x: Int) -> Int|None
decl g(x: Int) -> Int

def g(x):
    return f(max(x, 1))

def f(x):
    if x > 0:
        return x
        return None

I *want* the static typing to complain and to be warned or blow up
with the message that according to the type information alone, g() is
not verifiably an Int.  It seems like you want this to work without
complaints.  Is this correct?

> > > The origination of this discussion was based on the recursive type issue.
> > > If we have runtime objects, then I doubt we could support the recursive
> > > type thing without some additional work. Or, as I'm suggesting, you do not
> > > allow an undefined name (as specified by runtime/execution order) to be
> > > used in a typedecl.

I still don't see how enforcing your suggestion allows any compile time
checking at all -- unless you you further qualify it with 'used in a typedecl
as it operates at run time'.  What happens at run time should become
more clear once we come up with a way to provide run time access to
compile time static typing.  Run time behavior could even be programmable.


> I'm suggesting a similar mechanism be made available to resolve the
> recursive typedecl issue. Specifically, we provide a way to create a
> partially-defined ("incomplete") typedecl object and bind that to a name.
> That name can then be used; later, the name will become fully specified.
> More thought is needed here, but I'll hold off as this is still premised
> on runtime typedecl availability.


yes, essentially the same way that

struct node {
    int i;
    node * n;

is OK, but

struct node {
    int i;
    node n;

isn't. (gives 'incomplete type' with gcc).  You need to be able to have
a reference to a type and alter it as the type declarations are

> >...
> > > I do believe the information goes into the bytecode, but I don't think
> > > that is the basis for needing to plan now. Instead, we have to define the
> > > semantics of when/where those typedecl objects exist. Do we have them at
> > > runtime? 
> > 
> > in the above, no, though we do have the ability to find a name
> > anywhere at compile time.

I'd like to recant this statement and replace it with:  

1) The typedecl information is stored in an application-wide static
type model which is created at compile time (implies typedecl specific
import/#include mechanism).

2)The model is mapped to something potentially available at run time,
eg bytecode with associated module, classe and function objects.

3)The runtime environment can do with that information what it
pleases, but 1) and 2) need to be done first, and have a lot of
potential for use, even without 3).

> > 
> > >Does a name have to exist (in terms of runtime execution) for it
> > > to be used in a typedecl, or does it just have to exist *somewhere*? 
> > 
> > in the above, it has to exist in the typedecl 'execution' model, which
> > is during compile time.
> >
> > >If
> > > names must exist before usage, then how is the recursive type thing
> > > handled? With unspecified typedecls? (like an unspecified struct)
> > 
> > How about an iterative model which continues until all typedecl names
> > are filled in?
> These three items form a possible alternative. You wouldn't really need an
> iterative model to gather typedecl names; two passes is sufficient.
> >...

> checking. If you're talking about a name having multiple types over a
> period of time, then I disagree: we can handle that case.

perhaps for local variables, but I don't see how with global variables
unless that global variable is explicitly stated to be a union by the
programmer, and the type model works out OK -- with atleast the option
of working with my expectation of static typing of (global) unions as
described above.

> Also, I think the runtime objects are for more than the occasional type
> assertion.

Indeed, there are lots of places where optimizations can be made at runtime if
the types are known.  I just don't think we're there yet.  I don't think
undefined names are an issue with possible future run-time optimizations and 
introspective interfaces, as it seems like a NameError would do it's job only
after all namespaces are searched, and that something introspective
wouldn't even run if a NameError was going to be raised.

For the moment, it seems sufficient to make static, compile-time type
information available at run time (optionally), and then we can decide
what to do with that information at run time - optimize code, keep
tabs on the types of variables at run time and make sure they match
the static type information, etc.

> > As weird as it is to have a separate type-decl name model, it seems
> > infintely  to depict dynamic typing in a static typing model.
      insert 'weirder ' here

guido should threaten to kill sigs more often :)


> > 
> > you could even allow typedecl to import modules for the sake of
> > gaining access to the names, where those imports would only occur when
> > the optional type checking is turned on.  I'd agree that the use of an
> > undefined name should be disallowed.  With the presence of
> > type-check-only import, following the same
> > no-mutually-recursive-imports rule of the regular import, but only
> > importing typedecl statements, you could achieve all this at compile
> > time. 
> Actually, the recursive import issue is resolved by have a module
> registered which is incomplete. If you have:
> --- a.py
> import b
> --- b.py
> import a
> >>> import a

right, but there are limits:

--- a.py
from b import c
d = 1

--- b.py
from a import d
c = 1

doesn't work:
1222 04:47 shock:~% python a.py
Traceback (innermost last):
  File "a.py", line 1, in ?
    from b import c
  File "b.py", line 1, in ?
    from a import d
  File "a.py", line 1, in ?
    from b import c
ImportError: cannot import name c