[Types-sig] PyDL RFC 0.01

Paul Prescod paul@prescod.net
Sun, 26 Dec 1999 22:45:12 -0500


I've been off-list for a few days, so if this RFC doesn't include the
last few days' feedback, I apologize in advance.

PyDL RFC 0.01
=============

A PyDL file declares the interface for a Python module. PyDL files
declare interfaces, objects and the required interfaces of objects.

At some point in the future, PyDL files will likely be generated from
source code using a combination of declarations within Python code and
some form of interface deduction and inference based on the contents
of those files. For version 1, however, PyDL files are separate,
although they do have some implications for the Python runtime.

This document describes the behavior of a class of software modules
called "static interface interpreters" and "static interface
checkers". Interface interpreters are run as part of the regular
Python module interpretation process. They read PyDL files and make the
type objects available to the Python interpreter. Interface checkers
read interfaces and Python code to verify conformance of the code to
the interface.

Concepts:
=========

An interface is a Python object with the following attributes:

__conforms__ : def (obj: Any ) -> boolean
__class_conforms__ : def (obj: Class ) -> boolean

(the rest of the interface reflection API will be worked out later)
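
As a rough illustration only, an interface object carrying these two
attributes might look like the following in present-day Python. The
class name SimpleInterface and the "has the required attributes" rule
are assumptions made for this sketch; the RFC does not yet say how
conformance is decided.

```python
class SimpleInterface:
    """Hypothetical interface object (not part of the RFC).

    Assumption: an object conforms if it exposes every required attribute.
    """

    def __init__(self, name, required_attrs):
        self.name = name
        self.required_attrs = tuple(required_attrs)

    def __conforms__(self, obj):
        # True if the object exposes every required attribute.
        return all(hasattr(obj, attr) for attr in self.required_attrs)

    def __class_conforms__(self, cls):
        # Only class-level attributes are visible here, so attributes that
        # are set per instance in __init__ would not be detected.
        return all(hasattr(cls, attr) for attr in self.required_attrs)


class Stack:
    def push(self, item):
        pass

    def pop(self):
        pass


StackLike = SimpleInterface("StackLike", ["push", "pop"])
```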

Interfaces can be created through interface definitions and typedefs.
There may also be facilities for creating interfaces at runtime but
they are neither available nor relevant to the interface interpreter.

Interface definitions are similar to Python class definitions. They
use the keyword "interface" instead of the keyword "class". 

Sometimes an interface can be specialized for working with specific
types. For instance a list could be specialized for working with
integers. We call this "parameterization". A type with unresolved
parameter variables is said to be "parameterizable". A type with some
resolved parameter variables is said to be "partially resolved."
A type with all parameter variables resolved is said to be "fully
resolved."
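
The three states can be modelled with a small sketch. The class
ParameterizedType and its methods are hypothetical names, and for
simplicity the sketch treats the three labels as mutually exclusive.

```python
class ParameterizedType:
    """Hypothetical model of a type with named parameter variables."""

    def __init__(self, name, params, bindings=None):
        self.name = name
        self.params = tuple(params)
        self.bindings = dict(bindings or {})

    def resolve(self, **new_bindings):
        # Return a new type with some more parameter variables resolved.
        unknown = set(new_bindings) - set(self.params)
        if unknown:
            raise TypeError("unknown parameters: %s" % sorted(unknown))
        merged = dict(self.bindings)
        merged.update(new_bindings)
        return ParameterizedType(self.name, self.params, merged)

    def state(self):
        unresolved = [p for p in self.params if p not in self.bindings]
        if not unresolved:
            return "fully resolved"
        if self.bindings:
            return "partially resolved"
        return "parameterizable"


array = ParameterizedType("Array", ["elements", "size"])
int_array = array.resolve(elements="Integer")
int_array_50 = int_array.resolve(size=50)
```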

Typedefs allow us to give names to partially or fully resolved 
instantiations of interfaces.

In addition to defining interfaces, it is possible to declare other
attributes of the module. Each declaration associates an interface
with the name of the attribute. Values associated with the name in the
module namespace must never violate the declaration. Furthermore, by
the time the module has been imported each name must have an
associated value.
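
A minimal sketch of the post-import rule follows. It assumes that
declarations are available as a name-to-type mapping and uses
isinstance as a stand-in for interface conformance; the function name
check_module_namespace is hypothetical.

```python
def check_module_namespace(namespace, declarations):
    """Report declared names that are unassigned or hold a non-conforming value.

    Assumption: `declarations` maps attribute names to Python types, and
    isinstance stands in for the interface conformance test.
    """
    problems = []
    for name, declared in declarations.items():
        if name not in namespace:
            problems.append("%s: declared but never assigned" % name)
        elif not isinstance(namespace[name], declared):
            problems.append("%s: value does not conform to declaration" % name)
    return problems
```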

Behavior:
=========

The Python interpreter invokes the static interface interpreter and
optionally the interface checker on a Python file and its associated
PyDL file.  Typically a PyDL file is associated with a Python file
through placement in the same path with the same base name and a
".pydl"  or ".gpydl" extension. "Non-standard" importer modules may
find PyDL files using other mechanisms, such as a look-up in a
relational database.
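
The default association could be sketched as below. The function name
find_pydl_files is hypothetical; only the "same path, same base name,
.pydl or .gpydl extension" rule comes from this RFC.

```python
from pathlib import Path

PYDL_EXTENSIONS = (".pydl", ".gpydl")


def find_pydl_files(python_file):
    """Return the existing PyDL files that sit next to python_file."""
    source = Path(python_file)
    # Same directory, same base name, PyDL extension.
    candidates = [source.with_suffix(ext) for ext in PYDL_EXTENSIONS]
    return [path for path in candidates if path.is_file()]
```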

The interface interpreter reads the interface file and builds the
relevant type objects. If the interface file refers to other modules
then the interface interpreter can read the interface files associated
with those other modules. The interface interpreter maintains its own
module dictionary so that it does not import the same module twice.
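
The module dictionary might work roughly like this. The class below is
a hypothetical skeleton: load_source stands in for reading and
processing a PyDL file, which is not specified here.

```python
class InterfaceInterpreter:
    """Hypothetical skeleton of an interface interpreter's import cache."""

    def __init__(self, load_source):
        self._load_source = load_source  # assumed callable: module name -> PyDL text
        self.modules = {}                # the interpreter's own module dictionary

    def import_interface(self, module_name):
        # Reuse the cached entry if this module has already been read.
        if module_name in self.modules:
            return self.modules[module_name]
        namespace = {"__source__": self._load_source(module_name)}
        self.modules[module_name] = namespace
        return namespace
```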

The Python interpreter can optionally invoke the interface checker
after the interface interpreter has built type objects and before it
interprets the Python module.

Once the Python interpreter has interpreted the Python code, the type
objects are available to the runtime code through a special namespace
called the "interface namespace". This namespace is interposed in the
name search order between the module's namespace and the built-in
namespace.
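
The search order can be pictured with collections.ChainMap, which
looks names up in its mappings from left to right. This is only a
model of the lookup order, not a proposal for how the interpreter
would implement it; the dictionary contents are made up.

```python
from collections import ChainMap

# Made-up namespace contents for illustration.
module_ns = {"x": 1}
interface_ns = {"Integer": "<interface Integer>", "x": "<shadowed by module>"}
builtin_ns = {"len": len, "Integer": "<shadowed by interface namespace>"}

# Module namespace first, then the interface namespace, then built-ins.
lookup = ChainMap(module_ns, interface_ns, builtin_ns)
```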

Type expression language:
=========================

Type expressions are used to declare the types of variables and to
make new types. In a type expression you may:

1. refer to a "dotted name" (local name or name in an imported module)

2. make a union of two or more types:

integer or float or complex

3. parameterize a type:

Array( Integer, 50 )

Note that the arguments can be either types or simple Python
expressions. A "simple" Python expression is an expression that does
not involve a function call.

4. use a syntactic shortcut:

[Foo] => Sequence( Foo ) # sequence of Foo's
{a:b} => Mapping( a, b ) # Mapping from a's to b's
(a,b,c) => Record( a, b, c ) # 3-element sequence: an a, then a b, then a c

5. declare unmodifiability:

const [const Array( Integer )]
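
The syntactic shortcuts in item 4 amount to a simple rewrite. The
sketch below models type expressions as ordinary Python lists, dicts
and tuples with type names as strings, which is an assumption of the
sketch and not the PyDL parser.

```python
def expand_shortcut(expr):
    """Rewrite [a], {a: b} and (a, b, ...) into Sequence/Mapping/Record tuples."""
    if isinstance(expr, list) and len(expr) == 1:
        return ("Sequence", expand_shortcut(expr[0]))
    if isinstance(expr, dict) and len(expr) == 1:
        (key, value), = expr.items()
        return ("Mapping", expand_shortcut(key), expand_shortcut(value))
    if isinstance(expr, tuple):
        return ("Record",) + tuple(expand_shortcut(item) for item in expr)
    return expr  # a plain type name is left unchanged
```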

Declarations in a PyDL file:
============================

(formal grammar to follow)

 1. Imports

An import statement in an interface file loads another interface file.

 2. Basic attribute type declarations:

decl myint as Integer                   # basic 
decl intarr as Array( Integer, 50 )     # parameterized
decl intarr2 as Array( size = 40, elements = Integer )
                                        # using keyword syntax

Attribute declarations are not parameterizable. Furthermore, they must
resolve to fully resolved (not parameterizable!) types.

 3. Callable object type declarations:

Functions are the most common sort of callable object but class
instances can also be callable. They may be runtime parameterized
and/or type parameterized.  For instance, there might be a method
"add" that takes two numbers of the same type and returns a number of
that type.

decl Add(_X: Number) as def( a: const _X, b: const _X )-> _X
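
For comparison only, present-day Python can express a similar "same
type in, same type out" signature with typing.TypeVar. This is an
analogy, not PyDL: the constrained type variable lists int, float and
complex as stand-ins for Number, and nothing is enforced at runtime.

```python
from typing import TypeVar

# Assumption: int, float and complex stand in for PyDL's Number.
_X = TypeVar("_X", int, float, complex)


def add(a: _X, b: _X) -> _X:
    # A static checker would require a, b and the result to share one type.
    return a + b
```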

 4. Class Declarations

A class is a callable object that can be subclassed.  Currently the
only way to make those (short of magic) is with a class declaration,
but one could imagine that there might someday be a __subclass__
magic method that would allow any old object instance to also stand in
as a class.

decl TreeNode(_X: Number) as 
        class( a: _X, Right: TreeNode( _X ) or None,
                    Left: TreeNode( _X ) or None )
                -> ParentClasses, Interfaces

 5. Interface declarations:

interface (_x,_y) foo(a, b ):
    decl shared somemember as _x
    decl someOtherMember as _y
    decl shared const someClassAttr as List( _x )

    decl shared const someFunction as def( a: Integer, b: Float ) -> String

 6. Typedefs:

Typedefs allow interfaces to be renamed and for parameterized
variations of interfaces to be given names.

typedef PositiveInteger as BoundedInt( 0, maxint )
typedef NullableInteger as Integer or None
typedef Dictionary(_Y) as {String:_Y}

The Undefined Object:
=====================

The undefined object is used as the value of unassigned attributes and
the return value of functions that do not return a value. It may not
be bound to a name.

a = Undefined   # raises UndefinedValueError
a = b           # raises UndefinedValueError if b has not been assigned

Undefined CAN be compared.

if a==Undefined:
    blah
    blah
    blah

New Runtime Function:
=====================

conforms( x: Any, y: Interface ) -> Any or Undefined

This function can be used in various ways. Here it is used as an
assertion:

j = conforms( j, Integer )

which is equivalent to:

if not isinstance( j, Integer ):
    raise UndefinedValueError

Here it is used as a test:

if conforms( j, Integer )!=Undefined:
    anint = conforms( j, Integer )

which is equivalent to the very similar isinstance based code:

if isinstance( j, Integer ):
    anint = j
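
The behaviour of conforms can be approximated in ordinary Python as
follows. Two things here are assumptions of the sketch: isinstance
stands in for the interface's conformance test, and because Python
cannot intercept plain assignment, a helper named checked (a
hypothetical name) emulates the rule that Undefined may not be bound
to a name.

```python
class UndefinedValueError(Exception):
    """Raised when Undefined would be bound to a name."""


class _UndefinedType:
    def __repr__(self):
        return "Undefined"


Undefined = _UndefinedType()  # single sentinel; the default == compares identity


def conforms(x, interface):
    # Return x itself if it conforms, otherwise the Undefined object.
    return x if isinstance(x, interface) else Undefined


def checked(value):
    # Emulates the binding rule: refuse to hand Undefined to an assignment.
    if value is Undefined:
        raise UndefinedValueError("cannot bind Undefined to a name")
    return value
```

With these definitions, j = checked( conforms( j, int ) ) plays the
role of the assertion form, and conforms( j, int ) != Undefined plays
the role of the test form.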

Experimental syntax:
====================

There is a backwards compatible syntax for embedding declarations in a
Python 1.5.x file:

"decl","myint as Integer"
"typedef","PositiveInteger as BoundedInt( 0, maxint )"

There will be a tool that extracts these declarations from a Python
file to generate a .gpydl (for "generated PyDL") file. These files are
used alongside hand-crafted PyDL files. The "effective interface" of
the Python file is evaluated by combining the declarations from the
two files as if they were concatenated together (more or less...exact
details to
follow). The two files must not contradict each other, just as
declarations within a single file must not contradict each other.
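
One way such an extraction tool could work is sketched below using
the ast module of present-day Python. It assumes that embedded
declarations are top-level expression statements made of exactly two
string literals, as in the examples above; the function name
extract_declarations and the output format are hypothetical.

```python
import ast

KEYWORDS = {"decl", "typedef"}


def extract_declarations(source):
    """Collect top-level ("decl" | "typedef", "...") statements as PyDL lines."""
    lines = []
    for node in ast.parse(source).body:
        # "decl","x as T" parses as an expression statement holding a tuple.
        if not (isinstance(node, ast.Expr) and isinstance(node.value, ast.Tuple)):
            continue
        elts = node.value.elts
        if len(elts) != 2:
            continue
        if not all(isinstance(e, ast.Constant) and isinstance(e.value, str)
                   for e in elts):
            continue
        keyword, body = elts[0].value, elts[1].value
        if keyword in KEYWORDS:
            lines.append("%s %s" % (keyword, body))
    return lines
```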

Over time the generation of the .gpydl file may be more intelligent
and may deduce type information based on code outside of explicit
declarations (for instance function and class definitions, assignment
statements and so forth).

Runtime Implications:
=====================

All of the named types defined in a PyDL file are available in the
"types" dictionary that is searched between the module dictionary and
the built-in dictionary.

The runtime should not allow an assignment or function call to violate
the declarations in the PyDL file. In an "optimized speed mode" those
checks would be disabled.
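
A minimal sketch of checked assignment with a switch for the
"optimized speed mode" follows. The class CheckedNamespace and its
check flag are hypothetical, and isinstance against ordinary Python
types stands in for interface conformance.

```python
class CheckedNamespace:
    """Hypothetical namespace that enforces declarations on assignment."""

    def __init__(self, declarations, check=True):
        # Bypass our own __setattr__ while storing the bookkeeping fields.
        object.__setattr__(self, "_declarations", dict(declarations))
        object.__setattr__(self, "_check", check)

    def __setattr__(self, name, value):
        declared = self._declarations.get(name)
        if self._check and declared is not None and not isinstance(value, declared):
            raise TypeError("%s must conform to %s" % (name, declared.__name__))
        object.__setattr__(self, name, value)


strict = CheckedNamespace({"myint": int})
fast = CheckedNamespace({"myint": int}, check=False)  # "optimized speed mode"
```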