[Types-sig] Proposed Goals PEP

Krishnaswami, Neel neelk@cswcasa.com
Mon, 12 Mar 2001 11:57:03 -0500


Paul Prescod [mailto:paulp@ActiveState.com] wrote:
> 
> I propose that the first round of the types-sig focus on a syntax for
> type declarations in classes (and by extension interfaces). 
>
> Here are the goals expanded:
> 
> 1. Better error checking. (major)
> If there are many levels of code it is not at all clear where 
> the value first went "bad". We want to make it possible to detect 
> the mistake much earlier. 
> 
> The canonical "use case" for this is the standard library. 
> The standard library should check argument values for sanity as 
> much as is practical. Right now it does not do much checking 
> because the syntax for doing so is obfuscatory and verbose.

Agreed. This is the big one. 
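
For concreteness, here is roughly what that hand-written checking
looks like today (the function and its arguments are made up, not
taken from the actual library):

    def repeat(item, count):
        # By-hand argument checking of the sort the library would
        # need everywhere; verbose relative to the work being done.
        if not isinstance(count, int):
            raise TypeError("count must be an int, not %s"
                            % type(count).__name__)
        if count < 0:
            raise ValueError("count must be non-negative")
        return [item] * count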
 
> 2. Better Documentation (major)
> 
> Right now, it is basically impossible for programs like PyDoc 
> or IDLE to detect the types of arguments to functions. This makes 
> it difficult to give the user a clear summary of what a function 
> expects or returns.

This isn't going to help unless you have parametrized types. The
most common case of type error in my code is when I pass in a list 
or dictionary with bogus element types. This is because a generic 
list or dictionary of a particular shape is usually being used as 
a custom "type." Errors from incorrectly handling primitive types 
just don't seem to happen to me -- even the string/sequence 
thing basically never bites me.
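
As a made-up illustration of what I mean (the names and data are
invented): the real "type" below is "dict mapping names to lists of
ints", but nothing states or checks that, so the bad element only
blows up far from where it was inserted:

    def total_scores(scores):
        # scores is assumed to map names to lists of ints
        totals = {}
        for name, values in scores.items():
            totals[name] = sum(values)   # the TypeError surfaces here
        return totals

    scores = {"alice": [3, 5], "bob": [2, "7"]}   # "7" slips in here
    total_scores(scores)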

I would be interested in other people's experiences, since I 
would be surprised if mine were unusual.

> 3. Optimization (minor)
> 
> If a Python compiler knew that a function works on integers or strings
> then it could internally use bytecode or native code optimized for
> integers or strings.

This is a totally unrealistic expectation. Adding typechecking is 
going to make typed Python code slower, not faster, for a very long 
time. 

This is for two reasons. One, the number of typechecks the 
interpreter does will go up, and without an optimization pass (which 
is error-prone to write) they can't be removed. Two, much of the
optimization will be from improving the representation of objects,
and that would require major surgery to the C FFI. 
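
A rough sketch of where the per-call cost comes from; the checking
helper below is hypothetical, but every call through it pays for the
isinstance tests, and without an optimization pass nothing can prove
them redundant and drop them:

    def checked(*arg_types):
        # Hypothetical helper: wraps a function so each positional
        # argument is isinstance-checked on every call.
        def wrap(func):
            def call(*args):
                for value, expected in zip(args, arg_types):
                    if not isinstance(value, expected):
                        raise TypeError("expected %s, got %s"
                                        % (expected.__name__,
                                           type(value).__name__))
                return func(*args)
            return call
        return wrap

    def add(x, y):
        return x + y
    add = checked(int, int)(add)   # every add() now does two extra checks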
 
This should not be a 1.0 goal -at all-, except in the modest sense 
that typed code shouldn't run too much more slowly than ordinary 
Python code. If you want to improve performance, focus on improving 
performance as a *separate* task -- stealing Scheme interpreter
implementation tricks can win a 2-5x speedup without changing
Python's semantics at all.

So the order should be err, doc, glue, with no mention of opt
whatsoever. 

> In order to get something done in a reasonable amount of time we must
> decide which issues not to tackle. At the top of that list, in my
> opinion, is static type checking (including a formally defined type
> inferencing system). Second is any form of higher order types. If we
> could agree that these are NOT on the table we will make progress much
> more quickly.

Type inference in the presence of subtyping is still generating 
research papers, which should scare you a lot. Even static
typechecking is infeasible with the current semantics of 
Python, because it would require substantial partial evaluation 
(calculating which classes are live and what their methods are), 
which is also a subject generating research papers. Therefore, both 
of these should be off-limits.

However, without parametric types there's really no gain from 
type declarations -- there's nothing that you can't do today
with NotImplementedError.
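
For example, you can already spell an interface out today as a base
class whose methods raise NotImplementedError (the class names here
are invented):

    class Stack:
        # The "declaration": documents the expected methods without
        # any new syntax.
        def push(self, item):
            raise NotImplementedError
        def pop(self):
            raise NotImplementedError

    class ListStack(Stack):
        def __init__(self):
            self.items = []
        def push(self, item):
            self.items.append(item)
        def pop(self):
            return self.items.pop()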

> In general, a major constraint is that a type system for Python should
> be simple. We should not be afraid to teach the type system to high
> school students. If Python's type system makes anyone's brain explode
> then we have not done a good job.

Yep.

--
Neel Krishnaswami
neelk@cswcasa.com