[Types-sig] RFC 0.1
Guido van Rossum
guido@CNRI.Reston.VA.US
Mon, 13 Dec 1999 13:09:15 -0500
> I don't think we would get anywhere if I just opened up the floor and
> had everyone yell their opinions about type safety. Here is a very rough
> starting point. Let's talk freely about it for a few days and then I'll
> try to direct the conversation based upon addressing the feedback.
Thanks for starting this, Paul!
> Version 0.1 Draft of a Pythonic Type Checking System
> ====================================================
>
> Guiding Principles in the System's Development
> ----------------------------------------------
>
> #1. The system exists to serve the dual goals of catching errors
> earlier in the development process and improving the performance of
> Python compilers and the Python runtime. Neither goal should be
> pursued exclusively.
Hm, these may at times be very different goals. I had a recent
private discussion about types where the two goals were referred to as
(OPT), for optimization, and (ERR), for error-detection. One
observation is that for (OPT) you may be able to get away with
aggressive whole-program type inferencing alone, but for (ERR) you're
likely to *want* to declare types in certain cases; e.g. to prepare
for possible evolution of a module you may want to fix its API to a
subset of what is actually implemented.
> #2. The system must allow authors to make assertions about the sorts
> of values that may be bound to names. These are called binding
> assertions. They should behave as if an assertion statement was
> inserted after every assignment to that name program-wide.
Technically, Python assert statements are only executed in
non-optimizing mode -- "assert 0" has no effect when you happen to use
"python -O" to execute your program. But I presume that here you mean
assertions in the abstract conceptual sense.
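
To make the distinction concrete, here is a minimal sketch. It assumes a present-day Python 3 interpreter, where the built-in compile() accepts an "optimize" argument that mimics "python -O"; the helper name runs_clean is hypothetical.

```python
def runs_clean(source, optimize):
    """Return True if `source` executes without raising AssertionError."""
    # optimize=0 keeps assert statements; optimize=1 strips them, like -O.
    code = compile(source, "<sketch>", "exec", optimize=optimize)
    try:
        exec(code, {})
    except AssertionError:
        return False
    return True

normal = runs_clean("assert 0", optimize=0)     # the assert is executed
optimized = runs_clean("assert 0", optimize=1)  # the assert is compiled away
```

A binding assertion in the abstract sense would presumably have to hold in both modes, which a literal assert statement does not.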
> Note: this does in fact put more power in the hands of module
> developers. For the first time we will be able to say that
> sys.exit may not be overridden in user code and that sys.maxint cannot
> be changed to contain a string.
I think JPython secretly already imposes some of these restrictions
(in particular for the sys module!).
> Note: the term "sorts of values" is meant to be ambiguous: the
> definition of "type" in Python may undergo change in the future.
>
> #3. Binding assertions must always be optional.
>
> #4. There must be declarations that instruct static type checking
> software to verify that a function cannot violate binding assertions.
> These are called safety declarations.
I'm not sure what you mean here and how such declarations differ from
type assertions. And I'm worried about the "must" part. Please explain
better?
> #5. The introduction of binding assertions to a module should not
> change the perceived interface of functions and classes in the module.
> In other words, code that uses functions and classes from the module
> should not need to know whether it uses binding assertions or old
> fashioned assert statements.
Except that some unintended uses may become illegal while before you
might just have gotten away with them.
> #6. In the absence of local safety declarations, a static type checker
> should not by default report errors in otherwise legal Python code. In
> other words, a coder must ask (through function or module level
> declarations, command line switches or environment variables) for his
> or her code to be checked. In particular, a module cannot force client
> modules to be statically type checked (see #5, above).
However, there are some examples of dynamic code usage that are
fishy. Examples include adding or changing globals in other modules
(except for the rare global that is intended to be a settable option),
or messing with the __builtin__ module.
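
A small sketch of the kind of code meant here, written for present-day Python 3 (where __builtin__ is spelled builtins). The module "other" and both function names are hypothetical stand-ins.

```python
import builtins
import types

# Hypothetical stand-in for some other module that owns one global.
other = types.ModuleType("other")
other.limit = 10

def client_code():
    # Rebinding another module's global: legal Python, but a checker that
    # looked only at "other" would still believe limit is an integer.
    other.limit = "ten"

def shadow_len():
    # Replacing a builtin changes what len() means for every module.
    original = builtins.len
    builtins.len = lambda obj: -1
    try:
        return len([1, 2, 3])
    finally:
        builtins.len = original  # restore so the sketch has no lasting effect

client_code()
patched_result = shadow_len()
```

Both functions are legal today, yet neither effect is visible from the source of the module being modified.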
> #7. The attachment of safety declarations to a function should not
> change the perceived interface of the function. In other words, code
> that uses the functions should not need to know that the function
> happens to be statically checkable.
But I'd still like to get a diagnostic at compile time instead of at
runtime when my code makes a statically illegal call to a function
with a safety declaration.
> #8. It is not a goal that a statically checkable function should only
> be able to call other statically checkable functions. Those other
> functions should be presumed to return a "PyObject" object.
>
> #9. There should be a mechanism to assert that an object has a
> particular type for purposes of informing the static and dynamic type
> checkers about something that the programmer knows about the flow of
> the program.
Beyond "assert isinstance(object, type_or_class)" ?
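
For reference, the existing idiom looks like this in practice. This is only a minimal sketch; the function name total_length is hypothetical.

```python
def total_length(items):
    # The programmer knows items is a list at this point; the assert states
    # that fact for a human reader and for any checker that understands it.
    assert isinstance(items, list)
    return sum(len(item) for item in items)

result = total_length(["ab", "c"])
```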
> #10. In general, the mechanism should try to be "pythonic" which
> includes but is not limited to:
>
> * maximize simplicity
> * maximize power
> * minimize syntax
> * be explicit
> * be readable
> * interoperate nicely with other features
>
> Temporary Goals and Non-Goals:
> ------------------------------
>
> #1. The first version of the system will be as neutral as possible on
> the issue of what defines a "type". Fulton's capability-based
> interfaces should be legal as types but so should type objects and
> classes.
>
> Note: a purely interface based system cannot be feasible for testing
> until interfaces are embedded deeply into the existing Python library.
> It might be more philosophically pure to test for an abstract
> CharacterString interface but if the Python expression "abc" does not
> return an object that conforms to the interface then there is not much
> we can do. Some future version of the system may be restricted to only
> allow declared interfaces as types. Or it may be expanded to allow
> parameterized types.
>
> #2. The first version of the system will not allow the use of types
> that cannot be referred to as simple Python objects. In particular it
> will not allow users to refer to things like "List of Integers" and
> "Functions taking Integers as arguments and returning strings."
It's been said before: that's a shame. Type inference is seriously
hindered if it doesn't have such information. (Consider a loop over
sys.argv; I want the checker to be able to assume that the items are
strings.)
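
A sketch of the information involved, expressed as a runtime check since no declaration syntax exists yet. The helper names is_list_of and upper_args are hypothetical, and the code assumes present-day Python 3.

```python
import sys

def is_list_of(value, item_type):
    """True if value is a list whose items are all instances of item_type."""
    return isinstance(value, list) and all(
        isinstance(item, item_type) for item in value
    )

def upper_args(argv):
    # Given a "list of strings" declaration for argv, a checker could accept
    # arg.upper() without any per-item declaration inside the loop.
    assert is_list_of(argv, str)
    return [arg.upper() for arg in argv]

argv_ok = is_list_of(sys.argv, str)
shouted = upper_args(["prog", "-v"])
```

Without a way to say "list of strings", the checker knows only that argv is a list and has to treat each item as an arbitrary object.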
> #3. The first version of the system will not define the operation of a
> type inferencing system. For now, all type declarations would need to
> be explicit.
I expect that this will make the system relatively heavy-weight and
hence unpythonic. You'd be sprinkling way more type decls over your
source code than would be necessary with a somewhat more sophisticated
type checker.
> #4. The first version of the system will be syntactically compatible
> with Python 1.5.x in order to allow experimentation in the lead-up to
> an integrated system in Python 2.
I think that this is too much of a constraint, and may be informing
your preliminary design too much. As long as an easy mechanical
transformation to valid Python 1.5.x is available, I'd be happy.
> Definitions:
> ------------
> Namespace creating suite:
> The suite contained directly within a module, class or function
> definition.
>
> Statically available namespace creating suite:
> The namespace creating suite defined by a module or class
> definition. We do not consider the suite contained within a function as
> statically available because the namespace only becomes available when
> the function is executed, not when it is declared.
>
> Name binding statement, target:
> An assignment statement (target), "def" statement ("funcname"),
> "class" ("classname") statement or "import" statement (module). ***
> more thought about "from" version ***
>
> Name declaration:
> A name bound at the most out-dented context of a statically
> available namespace creating suite.
The indentation doesn't enter into it. Consider
    if win32:
        def func(): ... # win32 specific version
    else:
        def func(): ... # generic version
> Classification:
> Due to a shortage of synonyms for "type" that do not already have a
> meaning, we use the word "classification."
Oh, dear. Keep looking for a better synonym!
> Given a value v and a value t, v conforms to classification t if any
> of the following holds:
>
> * t is returned by type( v )
> * t is returned by v.__class__
> * t is in v.__implements__ (the Fulton convention)
> * t is the "object" classification
> * v is the value "None"
>
> Classification Declaration:
> A statement that precedes a name binding statement and declares
> the classifications that the name must conform to. The type
> declaration must textually precede any use of the name.
>
> Classification Constraints:
> A pair of statements declaring the classifications that values
> bound to a name must support. There are a few syntactic variations:
>
> 1. A name binding statement preceded by a statement referencing a
> classification.
>
> <example>
> types.StringType
> a
>
> class foo:
>     types.IntType
>     j=5
> </example>
>
> This assertion is maintained by a combination of the static and
> dynamic type checkers. In order for the dynamic checker to work, we
> will need to modify the module_setattr and class_setattr functions for
> Python 1.6.
>
> 2. A simple expression containing only a tuple where all but the
> last item reference a classification. The last item should be a
> locally declared name. The statement must occur in the most out-dented
> context of a namespace creating statement suite:
>
> def foo(bar, baz):
>     types.IntType, bar
>     interfaces.NumericType, interfaces.SignedType, baz
>
> 3. The classification of a function is always "function" but its
> return classification can be specified with a declaration:
>
> <example>
> types.StringType
> def foo(): return "abc"
> </example>
>
> This can be checked through the introduction of "virtual" assertion
> statements into byte-code:
>
> <example>
> types.StringType
> def foo():
>     __tmp = "abc"
>     assert has_type( __tmp, types.StringType )
>     return "abc"
> </example>
Of course, in certain cases (as in this example) the type checker may
be able to prove that the assertion can never fail, and omit it.
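
One way to picture the inserted check is as a wrapper around the function. This is only a sketch in present-day Python 3, not the byte-code mechanism proposed above; the decorator name returns and the function bad are hypothetical.

```python
import functools

def returns(classification):
    """Wrap a function so that its return value is checked on every call."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            # The "virtual" assertion, made explicit at run time.
            assert isinstance(result, classification), (
                "%s returned %r" % (func.__name__, result)
            )
            return result
        return wrapper
    return decorate

@returns(str)
def foo():
    return "abc"

@returns(str)
def bad():
    return 42
```

For foo the check can never fail, so a checker able to prove that could drop the wrapper entirely; for bad it would report the error at compile time instead.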
> 4. The classification of class instance variables comes from the
> classification of the corresponding class variable.
>
> <example>
> class foo:
>     types.IntType
>     a=5
>
>     types.ListType
>     b=None
> </example>
The initialization for b denies its type declaration. Do you really
want to do this? This doesn't look like it should be part of the
final (Python 2.0) version -- it's just too ugly. How am I going to
explain this to a newbie with neither programming *nor* Python experience?
> Classification-testing expression:
>
> The function has_type takes a value and a reference to a
> classification or list of classifications. The return type of the
> function is the union of the classifications.
Perhaps this could be an extension of isinstance()? (That already
takes both class and type objects.)
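
A minimal sketch of what such a function might look like, following the conformance rules in the draft's definition of classification. It assumes present-day Python 3; treating __implements__ as a plain sequence is an assumption, and the class names are hypothetical.

```python
def has_type(value, classifications):
    """Sketch of has_type: accept one classification or a list of them."""
    if not isinstance(classifications, (list, tuple)):
        classifications = [classifications]
    if value is None:
        return True  # the draft lets None conform to any classification
    for t in classifications:
        if t is object or type(value) is t:
            return True
        if getattr(value, "__class__", None) is t:
            return True
        # Assumption: __implements__ is a sequence of interface objects.
        if t in getattr(value, "__implements__", ()):
            return True
    return False

class Readable:
    """Hypothetical interface object."""

class Stream:
    __implements__ = (Readable,)
```

Note that this follows the draft literally, so a subclass instance does not conform to its base class; an isinstance()-based version would accept it.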
> Classification-safe Function:
>
> a function that can be checked at compile time not to violate any
> classification constraints by assigning invalid values to any
> constrained names:
>
> Every reference to a name in a module or class (not instance!) must be
> to a declared (but perhaps not classification constrained) name.
Explain the reason for excluding instances? Maybe I'm not very clear
on what you're proposing here.
> <note>
> Remember that variables without classification constraints can be
> presumed to conform to the "Object" type.
> </note>
>
> Every expression must be type-checked based on the operators,
> constants and global and local name references.
Ah, good. This implies the "no messing with builtins or other
modules' globals" rule that I'm proposing.
> Attribute assignments and references are checked based upon the
> asserted classifications of the owning object.
>
> The classification of every assignment must be checked based on the
> types of constants, variables and function return types in the
> right-hand side.
>
> The classification of every function parameter must be checked based
> on the classifications of the argument expression.
>
> All return statements must be checked based on the classifications of
> the expressions.
OK. I'm not always sure whether you want compile-time or run-time
checking. Perhaps you can clarify this?
--Guido van Rossum (home page: http://www.python.org/~guido/)