[Python-Dev] Yet another type system -- request for comments on a SoC proposal

Sat May 6 04:17:29 CEST 2006

I'm currently revising a proposal for the Google Summer of Code, and it
was suggested that I start a thread here to get input. Apologies for the
length, but I wanted this to be more than just a link to my proposal.

The short version of my proposal is: the module would provide a
system by which the manifests* of function parameters are inferred
while the test suite runs, and those manifests would then be used to
check the arguments to function calls at runtime. In addition to
providing a form of runtime type checking, this information would be
used to automatically generate up-to-date documentation.

*: I'm using the word 'manifest' to refer to the list of methods /
attributes provided by an object -- that is, what you get from `dir(obj)`.
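To make the term concrete, here is a minimal sketch of what a manifest and a
manifest check could look like. The helper names `manifest` and `provides` are
made up for this illustration and are not taken from verify.py; the code is
written for a current Python 3 interpreter.

```python
import io


def manifest(obj):
    """Return the names reported by dir(obj) as an immutable set."""
    return frozenset(dir(obj))


def provides(obj, required):
    """Return True if obj's manifest contains every name in `required`."""
    return frozenset(required) <= manifest(obj)


buffer = io.StringIO("hello")
print(provides(buffer, ["read", "close"]))    # True: file-like object
print(provides("hello", ["read", "close"]))   # False: a plain str has neither
```

Comparing sets of names says nothing about what the methods do, only that they
exist, which is the same level of guarantee duck typing gives.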

I've built a *very* simple proof of concept here:
http://www.cs.uoregon.edu/~ryanf/verify.py

My full proposal is here:
http://www.cs.uoregon.edu/~ryanf/proposal.txt

It's been suggested that this could instead be an extension to one of
the lint tools. I'm open to that suggestion, but it seems to me that
it would seriously change the character of the modified lint tool, as
it would be running tests rather than simply analyzing source.

So why have something generate signatures automatically, rather than
have the programmer declare them explicitly? Currently, the knowledge
about a function's signature may be stored in four places (other than
the function's calls):

1. The function itself: the programmer expects to be passed a group of
   objects which implement certain functions, and uses them as such.
2. The tests run against the function.
3. A sequence of checks on the passed objects, though the author may
   trust the function user to pass the right thing.
4. The documentation for the function, though, again, the author may
   leave that out.

The proposed module would use the knowledge embedded in the system by
the programmer in the first two points to generate the other
information. As the function changes over time, these generated items
will be kept up to date simply by re-running the new test cases.
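The learn-then-check cycle described above could be sketched roughly as
follows. This is an illustration under my own assumptions, not the code in
verify.py: the names `set_mode`, `verify`, and `describe` are hypothetical,
the store lives in memory, and the inference rule (keep only the names common
to every argument seen during the test run) is just one plausible choice.

```python
import functools
import inspect

_learned = {}                  # {function name: {parameter: frozenset of names}}
_state = {"mode": "learn"}     # "learn" while tests run, "check" afterwards


def set_mode(mode):
    """Switch between recording manifests and enforcing them."""
    if mode not in ("learn", "check"):
        raise ValueError("unknown mode: %r" % (mode,))
    _state["mode"] = mode


def verify(func):
    """Record argument manifests in learn mode; enforce them in check mode."""
    signature = inspect.signature(func)
    learned = _learned.setdefault(func.__qualname__, {})

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            seen = frozenset(dir(value))
            if _state["mode"] == "learn":
                # Keep only the names shared by every argument seen so far.
                learned[name] = learned.get(name, seen) & seen
            elif name in learned:
                missing = learned[name] - seen
                if missing:
                    raise TypeError("%s(): argument %r lacks %s" % (
                        func.__name__, name, ", ".join(sorted(missing))))
        return func(*args, **kwargs)

    return wrapper


def describe(func):
    """Build a plain-text summary of the public names each parameter needs."""
    lines = [func.__name__ + ":"]
    for name, names in sorted(_learned.get(func.__qualname__, {}).items()):
        public = sorted(n for n in names if not n.startswith("_"))
        lines.append("  %s: %s" % (name, ", ".join(public)))
    return "\n".join(lines)


@verify
def scale(a, b):
    return a * b


# Stand-in for a test run: these calls teach the wrapper what scale() receives.
scale(2, 3)
scale(2.5, 4)

set_mode("check")
print(scale(3, 4))             # accepted: int provides all names int and float share
try:
    scale("x", 2)              # rejected: str lacks e.g. 'real' and 'imag'
except TypeError as exc:
    print("rejected:", exc)
print(describe(scale))         # generated documentation for scale()
```

Note that `scale("x", 2)` would run without error in plain Python and return
`"xx"`; the learned manifest is what flags it as a probable mistake.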

A sample program using this module might look like the following:

    import sigcheck

    verify = sigcheck.decorator('./verify.pickle')      # note 1
    sigcheck.mode('learn')

    @verify                                             # note 2
    def foo(file, b, c):
        pass  # ... do stuff ...

    class A:
        @verify
        def foo(self, a, b):
            return a * b

        # and, the worst case:
        @verify(('a', sigcheck.Int),                    # note 3.1
                ('b', ('m1', 'm2', 'm3')),              #      3.2
                ('c', sigcheck.Int, sigcheck.String),   #      3.3
                returns=sigcheck.String,                #      3.4
                raises=(MyException, MyOtherExc))       #      3.5
        def bar(self, a, b, c):
            pass  # ... do some stuff...

1. The inferred information needs to be stored somewhere. A file could
   hold the pickled data structure between testing runs.

2. This line shows the most basic use of the decorator. The system
   uses inference for all the parameters.

3. There may be cases where the inferences are incorrect, the user
   wishes to be totally explicit about certain parameters, or wishes
   to fold the inferences into their code explicitly. By passing a
   data structure to the decorator they may partially or completely
   bypass the inferences.
   1. The system will come with some common 'types' built in.
   2. The user may also spell out a required manifest free-form, as a
      tuple of method / attribute names.
   3. If a parameter may have several, very different manifests, they
      may simply be listed.
   4. The return type may also be specified.
   5. The function may raise exceptions which are expected; these
      should be specified so the inference system doesn't treat them
      as errors.
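
Notes 1 and 3.1 through 3.3 could be sketched along these lines. Everything
here is an assumption made for illustration: plain classes such as `int` and
`str` stand in for `sigcheck.Int` and `sigcheck.String`, and the helper names
(`satisfies`, `check_argument`, `save_manifests`, `load_manifests`) are
hypothetical rather than part of the proposed module.

```python
import os
import pickle
import tempfile


def satisfies(value, spec):
    """Check one value against one explicit spec.

    A spec is either a class (standing in for sigcheck.Int and friends)
    or a tuple of attribute names that the value must provide.
    """
    if isinstance(spec, type):
        return isinstance(value, spec)
    return all(hasattr(value, name) for name in spec)


def check_argument(name, value, *alternatives):
    """Accept the value if it satisfies any one of the listed alternatives."""
    if not any(satisfies(value, spec) for spec in alternatives):
        raise TypeError("argument %r does not match any declared spec" % (name,))


def save_manifests(path, manifests):
    """Pickle the inferred manifests so the next test run can reuse them."""
    with open(path, "wb") as handle:
        pickle.dump(manifests, handle)


def load_manifests(path):
    """Read back manifests written by save_manifests()."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


check_argument("a", 3, int)                     # note 3.1: built-in 'type'
check_argument("b", [], ("append", "pop"))      # note 3.2: free-form manifest
check_argument("c", "text", int, str)           # note 3.3: several alternatives

# Note 1: round-trip a small store through a pickle file in a temp directory.
store = {"foo": {"file": frozenset(["read", "close"])}}
path = os.path.join(tempfile.mkdtemp(), "verify.pickle")
save_manifests(path, store)
restored = load_manifests(path)
print(restored == store)
```

The return type and expected exceptions (notes 3.4 and 3.5) are left out of
this sketch; they would need the wrapper to inspect the call's outcome rather
than its arguments.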