[Python-Dev] type categories

Oren Tirosh oren-py-d@hishome.net
Thu, 15 Aug 2002 09:30:31 +0300

On Wed, Aug 14, 2002 at 09:38:18PM -0400, Andrew Koenig wrote:
> >> Not really.  I can see how an interface can claim that a particular
> >> method exists, but not how it can claim that the method implements a
> >> function that is antisymmetric and transitive.
> Guido> That's done in the docs, usually.  Zope even has the notion of a
> Guido> "marker" interface -- an interface that says "this object has property
> Guido> such-and-such" but which does not assert any methods or attributes.
> So perhaps what I mean by a category is the set of all types that
> implement a particular marker interface.

I propose that any method or attribute may serve as a marker. This makes it
possible to use an existing practice as a marker so protocols can be
defined retroactively for an existing code base. It's also possible, of 
course, to add an attribute called 'has_property_such_and_such' to serve 
as an explicit marker.
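
A minimal sketch of the idea, assuming a marker test is nothing more than
an attribute lookup. The helper has_marker and the classes Stack and Tagged
are hypothetical names used only for illustration.

```python
def has_marker(obj, name):
    """Return True if obj exposes an attribute or method called name."""
    return hasattr(obj, name)


class Stack:
    # Existing practice: push/pop were never declared as a protocol,
    # but their presence can serve as a marker retroactively.
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()


class Tagged:
    # Explicit marker: an attribute that carries no behaviour of its own.
    has_property_such_and_such = True


stack_markers = ("push", "pop")
is_stack_like = all(has_marker(Stack, m) for m in stack_markers)
```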

A type category is defined by a predicate that tests for the presence of
one or more markers.  Predicates can test not only for the presence of
markers but also for the type category of the marker object and for call 
signatures. When optional type checking is implemented they should also be
able to test for the categories of arguments and return values.
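
One way such a predicate could look, as a sketch only. It assumes that a
call signature is reduced to a count of positional parameters and that the
markers are ordinary instance methods; make_predicate, File and Broken are
hypothetical names.

```python
import inspect

_MISSING = object()


def make_predicate(required):
    """Build a membership predicate.

    required maps a marker name to the number of positional parameters
    it must accept (not counting self), or to None when only the
    presence of the marker is tested.
    """
    def predicate(cls):
        for name, arity in required.items():
            member = getattr(cls, name, _MISSING)
            if member is _MISSING:
                return False
            if arity is None:
                continue
            if not callable(member):
                return False
            # Looked up on the class, a method is a plain function,
            # so the first parameter is self and is skipped.
            params = list(inspect.signature(member).parameters.values())[1:]
            positional = [
                p for p in params
                if p.kind in (inspect.Parameter.POSITIONAL_ONLY,
                              inspect.Parameter.POSITIONAL_OR_KEYWORD)
            ]
            if len(positional) != arity:
                return False
        return True
    return predicate


class File:
    name = "example"

    def read(self, size):
        return ""


class Broken:
    name = "example"

    def read(self):  # same marker, different call signature
        return ""


readable = make_predicate({"name": None, "read": 1})
```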

A new category may be defined as a union or intersection of two existing
categories. This is done by ANDing or ORing the membership predicates of
the two categories and reducing them back to canonical form. Canonicalizing
a predicate is done by conversion into Disjunctive Normal Form, elimination
of redundant terms and products, sorting and a few other steps.
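
The canonical form can be sketched as follows, assuming predicates contain
only positive tests ("marker X is present"). A predicate is then a set of
products, each product being a set of marker names, and sets compare equal
regardless of order. The function names are hypothetical, and the "few
other steps" are not covered.

```python
from itertools import product


def canonical(terms):
    """Drop every product that is a strict superset of another one
    (absorption: A | (A & B) is equivalent to A)."""
    terms = {frozenset(t) for t in terms}
    return frozenset(t for t in terms if not any(o < t for o in terms))


def union(p, q):
    """Category union: OR the membership predicates."""
    return canonical(p | q)


def intersection(p, q):
    """Category intersection: AND the predicates by distributing
    every product of p over every product of q."""
    return canonical(a | b for a, b in product(p, q))


def show(p):
    """Sorted, readable rendering of a canonical predicate."""
    return " | ".join(sorted("&".join(sorted(t)) for t in p))


readable = canonical([{"read"}])
writable = canonical([{"write"}])
read_write = intersection(readable, writable)
either = union(readable, writable)
```

Because equivalent predicates reduce to equal frozensets, they can be used
directly as dictionary keys.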

A global dictionary of canonical predicates is kept (similar to interning
of strings) so any equivalent categories are merged. Each type object
can store a cache of categories in which it is a member so evaluation of
a membership predicate only needs to be done once for each type.
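
A sketch of the interning and caching, under the simplifying assumption
that a category is just a set of required marker names. The cache is kept
in a side table keyed by type rather than on the type object itself, and it
is never invalidated if a class is modified later. Category, category and
the module-level dictionaries are hypothetical names.

```python
import weakref

_interned = {}                             # canonical predicate -> Category
_membership = weakref.WeakKeyDictionary()  # type -> {Category: bool}


class Category:
    def __init__(self, markers):
        self.markers = markers
        self.evaluations = 0  # counts real predicate evaluations

    def includes(self, cls):
        cache = _membership.setdefault(cls, {})
        if self not in cache:
            self.evaluations += 1
            cache[self] = all(hasattr(cls, m) for m in self.markers)
        return cache[self]


def category(*markers):
    """Return the single shared Category object for these markers."""
    key = frozenset(markers)
    if key not in _interned:
        _interned[key] = Category(key)
    return _interned[key]


class Reader:
    def read(self):
        return ""

    def close(self):
        pass


a = category("read", "close")
b = category("close", "read")  # equivalent, so the same object comes back
first = a.includes(Reader)
second = a.includes(Reader)    # answered from the cache
```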

This may sound complicated but here's how it might work in practice:

Extracting a category from an existing class:

foobarlike = like(FooBar)

The members of the foobarlike category are any classes that implement the
same methods and attributes as FooBar, whether or not they are actually
descended from it. They may be defined independently in another library.
FooBar may be an abstract class used just as a template for a category.
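
A rough approximation of like() in present-day Python, assuming a category
is the set of public names a class exposes and ignoring call signatures.
The is_member helper and the example classes are hypothetical.

```python
def like(cls):
    """Extract a category: the public attribute and method names of cls."""
    return frozenset(n for n in dir(cls) if not n.startswith("_"))


def is_member(cls, category):
    """A class is a member if it exposes every name in the category."""
    return category <= like(cls)


class FooBar:
    # Abstract template, used only to define the category.
    def foo(self):
        raise NotImplementedError

    def bar(self):
        raise NotImplementedError


class Independent:
    # Not descended from FooBar; could live in another library.
    def foo(self):
        return 1

    def bar(self):
        return 2

    def extra(self):
        return 3


class OnlyFoo:
    def foo(self):
        return 1


foobarlike = like(FooBar)
```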

Asserting that a class must be a member of a category:

class SomeClass:
   __category__ = like(AnotherClass)

At the end of the class definition the class is checked to confirm that it
really is a member of that category (like(SomeClass) issubsetof
like(AnotherClass)). This attribute is inherited by subclasses, so every
subclass is checked in turn to confirm that it is still a member of the
category.  A subclass may also override this attribute:

class InheritImplementationButNotTheCategoryCheckFrom(SomeClass):
   __category__ = some_other_category

class AddAdditionalRestrictionsTo(SomeClass):
   __category__ = SomeClass.__category__ & like(YetAnotherClass)

If there is a conflict between the two categories the new category will
reduce to the empty set and an error will be generated. The error can be
quite informative by extracting a category from the new class, subtracting
it from the defined category and printing the difference.
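
The check and the error message can be approximated today with
__init_subclass__ (Python 3.6 and later). This is a sketch only: it assumes
a category is a set of required public names, so narrowing a category means
adding names to the set, and it covers only the missing-marker case, not
conflicting requirements. Checked is a hypothetical base class.

```python
def like(cls):
    return frozenset(n for n in dir(cls) if not n.startswith("_"))


class Checked:
    """Hypothetical base class that enforces __category__ on subclasses."""
    __category__ = frozenset()

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Subtract what the class provides from what the category demands.
        missing = cls.__category__ - like(cls)
        if missing:
            raise TypeError(
                f"{cls.__name__} is missing: {', '.join(sorted(missing))}"
            )


class AnotherClass:
    def read(self):
        raise NotImplementedError

    def close(self):
        raise NotImplementedError


class SomeClass(Checked):
    __category__ = like(AnotherClass)

    def read(self):
        return ""

    def close(self):
        pass


message = None
try:
    class Incomplete(SomeClass):
        # Additional restriction that the class body does not satisfy.
        __category__ = SomeClass.__category__ | {"seek"}
except TypeError as exc:
    message = str(exc)
```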

When a backward-compatible change is made to a protocol (e.g. adding a new
method) any modules that use the old category should still work because
the new category is a subcategory of the old one. When a
non-backward-compatible change is made (e.g. removing a method or changing
its call signature) existing code may still run without complaining,
depending on the category it uses to do the checking. If it's a wider
category that doesn't check for the affected method, the code should be ok.
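
The compatibility argument reduces to a subset test when categories are
modeled as sets of required marker names, which is an assumption of this
sketch. The names v1, v2, v3 and still_accepts are hypothetical.

```python
v1 = frozenset({"read", "close"})
v2 = v1 | {"seek"}     # backward-compatible change: a method was added
v3 = v2 - {"close"}    # incompatible change: a method was removed


def still_accepts(checked_category, provided):
    """True if code checking for checked_category accepts an object
    that provides exactly the names in provided."""
    return checked_category <= provided


wider = frozenset({"read"})  # a check that never looked at close()
```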

A non-backward-compatible change must change the exposed interface. This
may be ensured by adding an attribute or method that serves as an explicit
marker and includes a version number or is renamed in some other way when
making incompatible changes. Category union may be used to accept either of
two incompatible versions that are known to implement a common subset, even
if that subset has never been given a name.
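
A sketch of versioned markers combined by union, assuming the predicate is
kept as a set of alternatives where each alternative is a set of required
names. The marker names protocol_v1 and protocol_v2 and all classes are
hypothetical.

```python
def satisfies(cls, alternatives):
    """Member if the class provides every name of at least one alternative."""
    return any(all(hasattr(cls, m) for m in term) for term in alternatives)


v1 = frozenset({frozenset({"protocol_v1", "read", "close"})})
v2 = frozenset({frozenset({"protocol_v2", "read", "shutdown"})})
either_version = v1 | v2  # category union: accept either version


class OldImpl:
    protocol_v1 = True

    def read(self):
        return ""

    def close(self):
        pass


class NewImpl:
    protocol_v2 = True

    def read(self):
        return ""

    def shutdown(self):
        pass


class Unversioned:
    def read(self):
        return ""

    def close(self):
        pass


# The unnamed subset that both versions are known to implement.
common = frozenset.intersection(*either_version)
```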