[Types-sig] recursive types, type safety, and flow analysis

Greg Stein gstein@lyra.org
Wed, 22 Dec 1999 01:45:43 -0800 (PST)


On Wed, 22 Dec 1999, scott wrote:
>...
> > One of my points was that I do not believe you can issue warnings because
> > you can't know whether a problem might exist. Basically, it boils to not
> > knowing whether a global used by a function exists at the time the
> > function is called. So you either issues warnings for all global usage, or
> > you issue none. You can make a few guesses based on what happens in the
> > global code body, but I don't think the guesses will really improve the
> > quality of warnings.
> 
> I personally can't imagine that it would be an issue to treat globals
> in functions as anything other than a simple flat-rule: for type
> checking purposes, globals must be defined at compile time in the
> global namespace, that's just me, but I'd probably fire any of the
> python programmers that work for me if they did what you describe
> above with globals in a large project :)

So it sounds like we agree? Treat globals simply, using a union of all the
types that they may have in the global space (of course, noting that most
sane people won't be changing the type!). Do not worry about control flow:
specifically, what is the type and/or defined-status when <this> function
is called.

> > Examples? No, I don't really have any handy. Any example would be a short
> > code snippet and people would say, "yah. that's bad. it should fail." But
> > the issue is with larger bodies of code... that's what we're trying to
> > fix! So... No, I don't have a non-trivial example.
> 
> I can't even imagine one, so if there's any way to describe this
> global issue a little further without putting too much effort into it,
> I'd appreciate it.

I posted a set of 4 cases a few messages ago. Without control flow
analysis, the type checker cannot determine which of the four cases is
being used when it analyzes f(). Now just take one of those four patterns
and drop it into a large module. Given that big old module, it would be
nice to find problems with sequencing of type/defined-ness and function
calls (because it is too big to eyeball; we want compiler support), but
I'm saying "punt -- the compiler is not going to be able to provide any
kind of adequate warning."

The compiler *will* be able to generally verify types. It just can't
handle a determine which of a set of alternatives an object will have at a
specific point in type (assuming that object occurs in a different body of
code than that which is being analyzed).

Am I being clear enough? It seems like I've said this about three times so
far...

> > The origination of this discussion was based on the recursive type issue.
> > If we have runtime objects, then I doubt we could support the recursive
> > type thing without some additional work. Or, as I'm suggesting, you do not
> > allow an undefined name (as specified by runtime/execution order) to be
> > used in a typedecl.
> 
> you could even allow typedecl to import modules for the sake of
> gaining access to the names, where those imports would only occur when
> the optional type checking is turned on.  I'd agree that the use of an
> undefined name should be disallowed.  With the presence of
> type-check-only import, following the same
> no-mutually-recursive-imports rule of the regular import, but only
> importing typedecl statements, you could achieve all this at compile
> time. 

Actually, the recursive import issue is resolved by have a module
registered which is incomplete. If you have:

--- a.py
import b

--- b.py
import a

>>> import a

Module "a" will get partially defined and then its code will be run.
During that execution, the "import b" occurs and the "b" module is
imported. Now the code for "b" runs and it says "import a". Since "a" has
been partially defined (specifically, a name/module is entered into
sys.modules), then b.py can create a local name "a" referring to the
module object that it finds in sys.modules (which is about to be filled
in when the "import b" completes).

I'm suggesting a similar mechanism be made available to resolve the
recursive typedecl issue. Specifically, we provide a way to create a
partially-defined ("incomplete") typedecl object and bind that to a name.
That name can then be used; later, the name will become fully specified.
More thought is needed here, but I'll hold off as this is still premised
on runtime typedecl availability.

>...
> > I do believe the information goes into the bytecode, but I don't think
> > that is the basis for needing to plan now. Instead, we have to define the
> > semantics of when/where those typedecl objects exist. Do we have them at
> > runtime? 
> 
> in the above, no, though we do have the ability to find a name
> anywhere at compile time.
> 
> >Does a name have to exist (in terms of runtime execution) for it
> > to be used in a typedecl, or does it just have to exist *somewhere*? 
> 
> in the above, it has to exist in the typedecl 'execution' model, which
> is during compile time.
>
> >If
> > names must exist before usage, then how is the recursive type thing
> > handled? With unspecified typedecls? (like an unspecified struct)
> 
> How about an iterative model which continues until all typedecl names
> are filled in?

These three items form a possible alternative. You wouldn't really need an
iterative model to gather typedecl names; two passes is sufficient.

>...
> For me, it is sufficient to proceed from the premiss that you can't
> have static typing work on code that redefines types at run time, and
> to limit runtime checking (for the time being) to optionally have the
> interpreter take some action (warn or abort) when that happens. That
> requirement alone implies that typedecl'd names and their typedecl
> bodies need to be available at run time, which is sufficient to
> support just about any future developments in a static-typeing
> interface in pure python.

I definitely agree with the second part. For the first, if I assume
"redefines types at run time" as being "through some shady mechanism,
redefining a typedecl object", then yes: we can/should limit static
checking. If you're talking about a name having multiple types over a
period of time, then I disagree: we can handle that case.

Also, I think the runtime objects are for more than the occasional type
assertion.

>...
> As an aside, I'm glad to learn it wouldn't be difficult to have python
> put static type information in it's byte code.  That seems like a good
> place for it.

I'm hoping it would be hung from the function, class, and module objects.

> As weird as it is to have a separate type-decl name model, it seems
> infintely  to depict dynamic typing in a static typing model.

I don't follow/parse this line...

Cheers,
-0g

-- 
Greg Stein, http://www.lyra.org/