[Types-sig] RFC 0.1

Mon, 13 Dec 1999 13:09:15 -0500

> I don't think we would get anywhere if I just opened up the floor and
> had everyone yell their opinions about type safety. Here is a very rough
> starting point. Let's talk freely about it for a few days and then I'll
> try to direct the conversation based upon addressing the feedback.

Thanks for starting this, Paul!

> Version 0.1 Draft of a Pythonic Type Checking System
> ====================================================
> 
>     Guiding Principles in the System's Development 
>     ----------------------------------------------
> 
> #1. The system exists to serve the dual goals of catching errors
> earlier in the development process and improving the performance of
> Python compilers and the Python runtime. Neither goal should be
> pursued exclusively.

Hm, these may at times be very different goals.  I had a recent
private discussion about types where the two goals were referred to as
(OPT), for optimization, and (ERR), for error-detection.  One
observation is that while for (OPT) you may be able to get away with
aggressive whole-program type inferencing only, for (ERR) you're
likely to *want* to declare types in certain cases; e.g. to prepare
for possible evolution of a module you may want to fix its API to a
subset of what is actually implemented.

> #2. The system must allow authors to make assertions about the sorts 
> of values that may be bound to names. These are called binding
> assertions. They should behave as if an assertion statement was
> inserted after every assignment to that name program-wide.

Technically, Python assert statements are only executed in
non-optimizing mode -- "assert 0" has no effect when you happen to use
"python -O" to execute your program.  But I presume that here you mean
assertions in the abstract conceptual sense.
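
To make the -O point concrete, here is a small sketch in present-day
Python.  It assumes the optimize argument to compile(), which postdates
1.5.x, and the helper name assert_is_skipped is made up for
illustration.  It compiles the same one-line program with and without
optimization and records whether the assert had any effect.

```python
# Sketch only: the same source compiled with and without "-O" style
# optimization.  assert_is_skipped is a hypothetical helper name.
SOURCE = "assert 0, 'this assertion always fails'"

def assert_is_skipped(optimize):
    """Return True if the assert statement had no effect at this level."""
    # optimize=0 behaves like plain "python"; optimize=1 like "python -O".
    code = compile(SOURCE, "<demo>", "exec", optimize=optimize)
    try:
        exec(code, {})
    except AssertionError:
        return False
    return True

plain = assert_is_skipped(0)      # the assert fires
optimized = assert_is_skipped(1)  # the assert is compiled away
```

So a binding assertion that is literally implemented as an assert
statement would silently vanish under -O, which is presumably not what
the RFC intends.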

> Note: this does in fact put more power in the hands of module
> developers. For the first time we will be able to say that
> sys.exit may not be overridden in user code and that sys.maxint cannot
> be changed to contain a string.

I think JPython secretly already imposes some of these restrictions
(in particular for the sys module!).

> Note: the term "sorts of values" is meant to be ambiguous: the
> definition of "type" in Python may undergo change in the future.
> 
> #3. Binding assertions must always be optional.
> 
> #4. There must be declarations that instruct static type checking
> software to verify that a function cannot violate binding assertions.
> These are called safety declarations.

I'm not sure what you mean here and how such declarations differ from
type assertions.  And I'm worried about the "must" part.  Could you
explain this in more detail?

> #5. The introduction of binding assertions to a module should not
> change the perceived interface of functions and classes in the module.
> In other words, code that uses functions and classes from the module
> should not need to know whether it uses binding assertions or old
> fashioned assert statements.

Except that some unintended uses may become illegal where before you
might just have gotten away with them.

> #6. In the absence of local safety declarations, a static type checker
> should not by default report errors in otherwise legal Python code. In
> other words, a coder must ask (through function or module level
> declarations, command line switches or environment variables) for his
> or her code to be checked. In particular, a module cannot force client
> modules to be statically type checked (see #5, above).

However, there are some examples of dynamic code usage that are
fishy.  Examples include adding or changing globals in other modules
(except for the rare global that is intended to be a settable option),
or messing with the __builtin__ module.

> #7. The attachment of safety declarations to a function should not
> change the perceived interface of the function. In other words, code
> that uses the functions should not need to know that the function
> happens to be statically checkable.

But I'd still like to get a diagnostic at compile time instead
of at runtime when my code makes a statically illegal call to a
function with a safety declaration.

> #8. It is not a goal that a statically checkable function should only
> be able to call other statically checkable functions. Those other
> functions should be presumed to return a "PyObject" object.
> 
> #9. There should be a mechanism to assert that an object has a
> particular type for purposes of informing the static and dynamic type
> checkers about something that the programmer knows about the flow of
> the program.

Beyond "assert isinstance(object, type_or_class)" ?

> #10. In general, the mechanism should try to be "pythonic" which
> includes but is not limited to:
> 
>     * maximize simplicity
>     * maximize power
>     * minimize syntax
>     * be explicit
>     * be readable
>     * interoperate nicely with other features
> 
>     Temporary Goals and Non-Goals:
>     ------------------------------
> 
> #1. The first version of the system will be as neutral as possible on
> the issue of what defines a "type". Fulton's capability-based
> interfaces should be legal as types but so should type objects and
> classes.
> 
> Note: a purely interface based system cannot be feasible for testing
> until interfaces are embedded deeply into the existing Python library.
> It might be more philosophically pure to test for an abstract
> CharacterString interface but if the Python expression "abc" does not
> return an object that conforms to the interface then there is not much
> we can do. Some future version of the system may be restricted to only
> allow declared interfaces as types. Or it may be expanded to allow
> parameterized types.
> 
> #2. The first version of the system will not allow the use of types
> that cannot be referred to as simple Python objects. In particular it
> will not allow users to refer to things like "List of Integers" and
> "Functions taking Integers as arguments and returning strings."

It's been said before: that's a shame.  Type inference is seriously
hindered if it doesn't have such information.  (Consider a loop over
sys.argv; I want the checker to be able to assume that the items are
strings.)
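
To illustrate the sys.argv case: if the checker only knows that
sys.argv is a ListType, every use of an item needs its own assertion
before any string operation on it can be verified.  A rough sketch,
where the function name longest_argument is made up for illustration:

```python
import sys

def longest_argument(args):
    """Return the longest string in args, or "" if args is empty."""
    longest = ""
    for arg in args:
        # Without a "list of strings" type the checker cannot assume this,
        # so the programmer has to state it item by item.
        assert isinstance(arg, str)
        if len(arg) > len(longest):
            longest = arg
    return longest

if __name__ == "__main__":
    print(longest_argument(sys.argv))
```

If the declaration could say "list of strings", the assert inside the
loop would be unnecessary.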

> #3. The first version of the system will not define the operation of a
> type inferencing system. For now, all type declarations would need to
> be explicit.

I expect that this will make the system relatively heavy-weight and
hence unpythonic.  You'd be sprinkling way more type decls over your
source code than would be necessary with a somewhat more sophisticated
type checker.

> #4. The first version of the system will be syntactically compatible
> with Python 1.5.x in order to allow experimentation in the lead-up to
> an integrated system in Python 2.

I think that this is too much of a constraint, and may be informing
your preliminary design too much.  As long as an easy mechanical
transformation to valid Python 1.5.x is available, I'd be happy.

>     Definitions:
>     ------------
> Namespace creating suite:
>     The suite contained directly within a module, class or function
> definition.  
> 
> Statically available namespace creating suite:
>     The namespace creating suite defined by a module or class
> definition.  We do not consider the suite contained within a function as
> statically available because the namespace only becomes available when
> the function is executed, not when it is declared.
> 
> Name binding statement, target:
>     An assignment statement (target), "def" statement ("funcname"),
> "class" ("classname") statement or "import" statement (module).  ***
> more thought about "from" version ***
> 
> Name declaration:
>     A name bound at the most out-dented context of a statically
> available namespace creating suite.

The indentation doesn't enter into it.  Consider

    if win32:
        def func(): ... # win32 specific version
    else:
        def func(): ... # generic version

> Classification:
>     Due to a shortage of synonyms for "type" that do not already have a
> meaning, we use the word "classification." 

Oh, dear.  Keep looking for a better synonym!

>     Given a value v and a value t, v conforms to classification t if 
>         t is returned by type( v )
>         t is returned by v.__class__
>         t is in v.__implements__ (the fulton convention)
>         t is the "object" classification
>         v is the value "None"
> 
> Classification Declaration:
>     A statement that precedes a name binding statement and declares
> the classifications that the name must conform to. The type
> declaration must textually precede any use of the name.
> 
> Classification Constraints:
>     A pair of statements declaring the classifications that values
> bound to a name must support. There are a few syntactic variations:
> 
>     1. A name binding statement preceded by a statement referencing a
> classification. 
> 
> <example>
> types.StringType
> a
> 
> class foo:
>     types.IntType
>     j=5
> </example>
> 
> This assertion is maintained by a combination of the static and
> dynamic type checkers. In order for the dynamic checker to work, we
> will need to modify the module_setattr and class_setattr functions for
> Python 1.6.
> 
>     2. A simple expression containing only a tuple where all but the
> last item reference a classification. The last item should be a
> locally declared name. The statement must occur in the most out-dented
> context of a namespace creating statement suite:
> 
> def foo(bar, baz):
>     types.IntType, bar
>     interfaces.NumericType, interfaces.SignedType, baz
> 
>     3. The classification of a function is always "function" but its
> return classification can be specified with a declaration:
> 
> <example>
> types.StringType
> def foo(): return "abc"
> </example>
> 
> This can be checked through the introduction of "virtual" assertion
> statements into byte-code:
> 
> <example>
> types.StringType
> def foo(): 
>     __tmp = "abc"
>     assert has_type( __tmp, types.StringType )
>     return "abc"
> </example>

Of course, in certain cases (as in this example) the type checker may
be able to prove that the assertion can never fail, and omit it.

>     4. The classification of class instance variables comes from the
> classification of the corresponding class variable. 
> 
> <example>
> class foo:
>     types.IntType
>     a=5
> 
>     types.ListType
>     b=None
> </example>

The initialization for b contradicts its type declaration.  Do you really
want to do this?  This doesn't look like it should be part of the
final (Python 2.0) version -- it's just too ugly.  How am I going to
explain this to a newbie with no programming *nor* Python experience?

> Classification-testing expression:
> 
> The function has_type takes a value and a reference to a
> classification or list of classifications. The return type of the
> function is the union of the classifications.

Perhaps this could be an extension of isinstance()?  (That already
takes both class and type objects.)
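
For comparison, here is a minimal sketch of has_type as I read the
draft's conformance rule, written in present-day Python.  Several
things are my assumptions rather than part of the RFC: that it returns
a boolean, that the "object" classification maps onto the builtin
object, and that __implements__ is a sequence.

```python
def has_type(value, classifications):
    """Hypothetical sketch of the draft's conformance rule; not an API."""
    # Accept a single classification or a list/tuple of them.
    if not isinstance(classifications, (list, tuple)):
        classifications = [classifications]
    # Per the draft, None conforms to every classification.
    if value is None:
        return True
    for t in classifications:
        # Assumption: the "object" classification is the builtin object.
        if t is object:
            return True
        # Exact match on type(v) or v.__class__ (no subclass matching).
        if type(value) is t or getattr(value, "__class__", None) is t:
            return True
        # The Fulton convention: t listed in v.__implements__.
        if t in getattr(value, "__implements__", ()):
            return True
    return False
```

Note that this rule is an exact match, so has_type(True, int) is false
while isinstance(True, int) is true; an isinstance() extension would
have to settle which behavior is wanted.  Note also that
has_type(None, list) is true under the draft rule.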

> Classification-safe Function:
> 
>     a function that can be checked at compile time not to violate any
> classification constraints by assigning invalid values to any
> constrained names:
> 
> Every reference to a name in a module or class (not instance!) must be
> to a declared (but perhaps not classification constrained) name.

Explain the reason for excluding instances?  Maybe I'm not very clear
on what you're proposing here.

> <note>
> Remember that variables without classification constraints can be
> presumed to conform to the "Object" type.
> </note>
> 
> Every expression must be type-checked based on the operators,
> constants and global and local name references.

Ah, good.  This implies the "no messing with builtins or other
modules' globals" rule that I'm proposing.

> Attribute assignments and references are checked based upon the
> asserted classifications of the owning object.
> 
> The classification of every assignment must be checked based on the
> types of constants, variables and function return types in the
> right-hand side.
> 
> The classification of every function parameter must be checked based
> on the classifications of the argument expression.
> 
> All return statements must be checked based on the classifications of
> the expressions.

OK.  I'm not always sure whether you want compile-time or run-time
checking.  Perhaps you can clarify this?

--Guido van Rossum (home page: http://www.python.org/~guido/)