[Python-Dev] type categories

Wed, 14 Aug 2002 06:18:19 -0400

On Tue, Aug 13, 2002 at 03:45:29PM -0400, Michael McLay wrote:
> > So what I wonder is this:  Has there been much thought about making
> > these type categories more explicitly part of the type system?
> 
> The category names look like general purpose interface names. The addition of 
> interfaces has been discussed quite a bit. While many people are interested 
> in having interfaces added to Python, there are many design issues that will 
> have to be resolved before it happens. 

Nope. Type categories are fundamentally different from interfaces.  An 
interface must be declared by the type while a category can be an 
observation about an existing type. 

Two types that are defined independently in different libraries may in 
fact fit under the same category because they implement the same protocol.
With named interfaces they may in fact be compatible but they will not 
expose the same explicit interface. Requiring them to import the interface 
from a common source starts to sound more like Java than Python and would
introduce dependencies and interface version issues in a language that is 
wonderfully free from such arbitrary complexities.

Python is a dynamic language. It deserves a dynamic type category system,
not static interfaces that must be declared. It's fine to write a class and
somehow say "I intend this class to be in category X, please warn me if I 
write a method that will make it incompatible". But I don't want declarations 
to be a *requirement* for being considered compatible with a protocol. I 
have noticed that a lot of protocols are defined retroactively by 
observation of the behavior of existing code. There shouldn't be any need 
to go tag someone else's code as conforming to a protocol or put a wrapper
around it just to be able to use it.
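
To illustrate the optional "please warn me" declaration described above, here
is a minimal sketch in present-day Python. The decorator name `intends` and
the idea of describing a category by a list of method names are assumptions
made for this example only; nothing like this exists in the language or is
specified in this message.

```python
import warnings

def intends(*method_names):
    """Hypothetical class decorator: declare an *intent* to fit a category.

    The declaration is optional. It only produces a warning when the class
    is missing one of the named methods; it does not tag the class, and an
    undeclared class can still qualify for the same category.
    """
    def decorator(cls):
        missing = [name for name in method_names
                   if not callable(getattr(cls, name, None))]
        if missing:
            warnings.warn(
                f"{cls.__name__} is missing {missing} for its intended category"
            )
        return cls  # the class is returned unchanged
    return decorator

@intends("read", "close")
class LogReader:
    def read(self):
        return ""

    def close(self):
        pass
```

The class itself is left untouched, so code that never uses the decorator is
judged by exactly the same criteria as code that does.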

A category is defined mathematically by a membership predicate. So what we
need for type categories is a system for writing predicates about types.
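
As a rough sketch of what "a predicate about types" could look like, the
example below treats a category as a function from a type to a boolean. The
helper `has_methods` and the category `is_readable` are hypothetical names
chosen for illustration, and checking for method names is only one possible
kind of predicate.

```python
def has_methods(*names):
    """Build a membership predicate: true for any type defining all `names`."""
    def predicate(tp):
        return all(callable(getattr(tp, name, None)) for name in names)
    return predicate

# A hypothetical "readable" category, defined purely by observation.
is_readable = has_methods("read", "close")

class Pipe:
    # Written with no knowledge of the category and no declaration.
    def read(self):
        return ""

    def close(self):
        pass

class Counter:
    def increment(self):
        pass

print(is_readable(Pipe))     # Pipe defines both methods
print(is_readable(Counter))  # Counter defines neither
```

Note that `Pipe` never imports or mentions the category; membership is an
observation made from outside.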

Standard Python expressions should not be used for defining a category
membership predicate. A Python expression is not a pure function. This
makes it impossible to cache the results of which type belongs to what
category for efficiency. Another problem is that many different expressions 
may be equivalent but if two independently defined categories use equivalent 
predicates they should *be* the same category.  They should be merged at 
runtime just like interned strings. 
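
A minimal sketch of the interning and caching argument, under the assumption
that a predicate is plain data (here a frozenset of required method names)
rather than arbitrary Python code. The names `Category` and `category` are
hypothetical, and the cache is only valid as long as types are not modified
after they have been checked.

```python
class Category:
    """A category whose predicate is data, so it can be compared and cached."""

    def __init__(self, required):
        self.required = required  # canonical form: a frozenset of names
        self._cache = {}          # type -> bool, safe because the check is pure

    def contains(self, tp):
        if tp not in self._cache:
            self._cache[tp] = all(
                callable(getattr(tp, name, None)) for name in self.required
            )
        return self._cache[tp]

_interned = {}

def category(*names):
    """Return the one shared Category object for this set of names."""
    key = frozenset(names)  # order and duplicates do not matter
    if key not in _interned:
        _interned[key] = Category(key)
    return _interned[key]

# Two libraries defining "the same" category independently get one object.
a = category("read", "close")
b = category("close", "read")
print(a is b)
```

Because equivalent definitions reduce to the same key, the identity test
plays the role that `is` plays for interned strings.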

About a year ago I worked on a system for predicates having a canonical 
representation for security applications. While I was working on it I 
realized that it would be perfect for implementing a type category system
for Python. It would be useful at runtime for error detection and runtime
queries of protocols. It would also be useful at compile time for early
detection of some errors and possibly for optimization. By implementing
an optional strict mode the early error detection could be improved to the
point where it's effectively a static type system.

Just a quick example of the usefulness of canonical predicates: if I
calculate the intersection of two predicates and reduce it to canonical
form it will reduce to the FALSE predicate if no input will satisfy both
predicates. It will be equal to one of the predicates if that one is contained
by the other.
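
The two properties above can be sketched with a deliberately tiny predicate
language. This is not the system mentioned in this message, which is not
described here; it is a toy in which a predicate is a set of attribute names
a type must have plus a set it must not have. All names (`Pred`, `pred`,
`intersect`, `FALSE`) are assumptions for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pred:
    required: frozenset = frozenset()
    forbidden: frozenset = frozenset()
    satisfiable: bool = True

# The single canonical representation of "no input satisfies this".
FALSE = Pred(satisfiable=False)

def pred(required=(), forbidden=()):
    """Build a predicate in canonical form."""
    required, forbidden = frozenset(required), frozenset(forbidden)
    if required & forbidden:
        # Requiring and forbidding the same name can never be satisfied.
        return FALSE
    return Pred(required, forbidden)

def intersect(a, b):
    """Conjunction of two predicates, reduced to canonical form."""
    if a == FALSE or b == FALSE:
        return FALSE
    return pred(a.required | b.required, a.forbidden | b.forbidden)

sequence = pred(required={"__getitem__"})
mutable = pred(required={"__getitem__", "__setitem__"})
immutable = pred(required={"__getitem__"}, forbidden={"__setitem__"})

# `mutable` is contained by `sequence`, so the intersection equals `mutable`.
print(intersect(sequence, mutable) == mutable)
# Nothing can be both mutable and immutable, so this reduces to FALSE.
print(intersect(mutable, immutable) == FALSE)
```

Since every predicate has exactly one representation, containment and
emptiness can be decided with plain equality tests.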

I spent countless hours thinking about these issues, probably more than 
most people on this list... I think I have the foundation for a powerful 
yet unobtrusive type category system. Unfortunately it will take me some 
time to put it in writing and I don't have enough free time (who does?).

	Oren