On Wed, Aug 14, 2002 at 09:09:19AM -0400, Guido van Rossum wrote: ...
Oh dear. Here we go again. I'm afraid that it may take several frustrating iterations just to get our terminology and assumptions in sync and be able to start talking about the actual issues.
Type categories are fundamentally different from interfaces. An interface must be declared by the type while a category can be an observation about an existing type.
Yup. (In Python these have often been called "protocols". Jim Fulton calls them "lore protocols".)
Nope. For me protocols are conventions to follow for performing a certain task. A type category is a formally defined set of types. For example, the 'iterable' protocol defines conventions for a programmer to follow for doing iteration. The 'iterable' category is a set defined by the membership predicate "hasattr(t, '__iter__')". The types in the 'iterable' category presumably conform to the 'iterable' protocol so there is a mapping between protocols and type categories but it's not quite 1:1. Protocols live in documentation and lore. Type categories live in the same place where vector spaces and other formal systems live.
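A minimal sketch of that distinction, assuming a category is modelled as a plain predicate over types. The names `is_iterable_type` and `Countdown` are hypothetical and exist only for illustration; the predicate itself is the one quoted above.

```python
def is_iterable_type(t):
    """Membership predicate for the 'iterable' category: a test on a type."""
    return hasattr(t, '__iter__')


class Countdown:
    """Written with no knowledge of any category, yet observed to belong."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


# The category is the set of all types that satisfy the predicate;
# here it is evaluated over a small sample of candidate types.
candidates = [list, dict, int, Countdown]
iterable_category = {t for t in candidates if is_iterable_type(t)}
```

Nothing in `Countdown` declares membership. The predicate observes it after the fact, while the protocol (what `__iter__` is expected to do) stays in documentation.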
Two types that are defined independently in different libraries may in fact fit under the same category because they implement the same protocol. With named interfaces they may in fact be compatible but they will not expose the same explicit interface. Requiring them to import the interface from a common source starts to sound more like Java than Python and would introduce dependencies and interface version issues in a language that is wonderfully free from such arbitrary complexities.
Hm, I'm not sure if you can solve the version incompatibility problem by ignoring it. :-)
Oops, I meant interface version *numbers*, not interface versions. A version number is a one-dimensional entity. Variations on protocols and subprotocols have many dimensions. I find that set theory ("an object that has a method called foo and another method called bar") works better than arithmetic ("an object with version number 2.13 of interface voom").
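The set-theoretic view can be sketched as follows, assuming a category is described by the set of attribute names a type must provide. The helpers `category` and `belongs` and the sample classes are hypothetical, not part of any existing library.

```python
def category(*required):
    """Describe a category by the attribute names its members must have."""
    return frozenset(required)


def belongs(t, cat):
    """True if type t provides every attribute the category requires."""
    return all(hasattr(t, name) for name in cat)


has_foo = category('foo')
has_bar = category('bar')

# Requiring both methods is the union of the requirement sets, which
# corresponds to the intersection of the two categories.
has_foo_and_bar = has_foo | has_bar


class Widget:
    def foo(self):
        pass

    def bar(self):
        pass


class Gadget:
    def foo(self):
        pass
```

Here `Widget` belongs to all three categories and `Gadget` only to `has_foo`. No single version number orders these variations, but the requirement sets combine freely.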
Are you familiar with Zope's Interface package? It solves this problem (nicely, IMO) by allowing you to place an interface declaration inside a class but also allowing you to make calls to an interface registry that declare interfaces for pre-existing classes.
I don't like the bureaucracy of declaring interfaces and maintaining registries. I like the ad-hoc nature of Python protocols and I want a type system that gives me the tools to use it better, not replace it with something more formal.
A category is defined mathematically by a membership predicate. So what we need for type categories is a system for writing predicates about types.
Now I think you've lost me. How can a category on the one hand be observed after the fact and on the other hand defined by a rigorous mathematical definition? How could a program tell by looking at a class whether it really is an implementation of a given protocol?
A category is defined mathematically. A protocol is a somewhat more fuzzy meatspace concept. A protocol can be associated with a category with reasonable accuracy so the result of a set operation on categories is reasonably applicable to the associated protocols. Even a human can't always tell whether a class is *really* an implementation of a given protocol. But many protocols can be inferred with pretty good accuracy from the presence of methods or members. You can always add a member as a flag indicating compliance with a certain protocol if that is not enough. My basic assumption is that programmers are fundamentally lazy. It hasn't ever failed me so far. This way there is no need to declare all the protocols a class conforms to. This is important since in many cases the protocol is only "discovered" later. The user of the class knows what protocol is expected and only needs to declare that. It should reduce the tendency to use relatively coarse-grained "fat" interfaces because there is no need to declare every minor protocol the type conforms to - it may be observed by users of this type using a type category.
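A rough sketch of the two options mentioned above: inferring a protocol from the presence of methods, and falling back on an explicit flag member when inference is not accurate enough. The flag name `is_file_like` and the sample classes are hypothetical.

```python
def looks_file_like(t):
    """Inferred membership: usually right, but only an approximation."""
    return hasattr(t, 'read') and hasattr(t, 'close')


def declares_file_like(t):
    """Explicit membership: the type carries a flag member (hypothetical name)."""
    return getattr(t, 'is_file_like', False) is True


class LogReader:
    is_file_like = True  # opt-in flag for cases where inference is too loose

    def read(self):
        return ''

    def close(self):
        pass


class Book:
    """Has read() and close(), but they mean something else entirely."""

    def read(self):
        return 'chapter one'

    def close(self):
        pass
```

`Book` is a false positive for the inferred category, which is the "reasonable accuracy" caveat in practice. The flag-based category excludes it at the cost of one extra member on `LogReader`.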
Standard Python expressions should not be used for defining a category membership predicate. A Python expression is not a pure function. This makes it impossible to cache the results of which type belongs to what category for efficiency. Another problem is that many different expressions may be equivalent but if two independently defined categories use equivalent predicates they should *be* the same category. They should be merged at runtime just like interned strings.
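One way the merging and caching described above could look, as a sketch only. It assumes predicates are restricted to "has all of these attribute names", so that equivalence is easy to decide. The `Category` class and its attributes are hypothetical, not an existing API.

```python
import io


class Category:
    """A category whose predicate is a set of required attribute names.

    Equivalent predicates are interned: two independently written
    definitions yield the very same object, much like interned strings.
    """

    _interned = {}

    def __new__(cls, *required):
        key = frozenset(required)
        existing = cls._interned.get(key)
        if existing is None:
            existing = super().__new__(cls)
            existing.required = key
            existing._cache = {}
            cls._interned[key] = existing
        return existing

    def __contains__(self, t):
        # Caching is only sound if the predicate is pure and the type
        # is not modified after the first check.
        try:
            return self._cache[t]
        except KeyError:
            result = all(hasattr(t, name) for name in self.required)
            self._cache[t] = result
            return result

    def __and__(self, other):
        """Intersection of two categories: require both sets of names."""
        return Category(*(self.required | other.required))


readable = Category('read', 'close')
also_readable = Category('close', 'read')  # defined "elsewhere"
same_object = readable is also_readable
string_io_is_member = io.StringIO in readable
```

Because the predicate is data (a frozenset) rather than an arbitrary expression, equivalence reduces to set equality and the result of a set operation is itself an interned category.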
Again you've lost me. I expect there's something here that you assume well-known. Can you please clarify this? What on earth do you mean by "A Python expression is not a pure function"?
A function whose result depends only on its inputs and has no side effects. In this case I would add "and can be evaluated without triggering any Python code". Set operations on membership predicates, caching and other optimizations need such guarantees.
Oren
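To illustrate the purity concern, here is a small sketch showing that even `hasattr(t, '__iter__')` can run arbitrary Python code, here through a metaclass `__getattr__`. The names `Meta`, `Sneaky`, `calls` and `defines` are hypothetical. The stricter `defines` check is only an approximation of "evaluated without triggering any Python code": it assumes the metaclass does not override `__getattribute__`.

```python
calls = []


class Meta(type):
    def __getattr__(cls, name):
        # Runs whenever normal attribute lookup on the class fails,
        # so a membership test on the type has a visible side effect.
        calls.append(name)
        raise AttributeError(name)


class Sneaky(metaclass=Meta):
    pass


# The ordinary predicate triggers Meta.__getattr__ behind the scenes.
impure_result = hasattr(Sneaky, '__iter__')
calls_after_hasattr = list(calls)


def defines(t, name):
    """Look only in the class dictionaries along the MRO.

    No __getattr__ hook is consulted, so the answer depends only on
    what the classes actually define.
    """
    return any(name in vars(klass) for klass in t.__mro__)


strict_result = defines(Sneaky, '__iter__')
calls_after_defines = list(calls)
```

After the `hasattr` call, `calls` records the lookup. The `defines` check leaves it unchanged, which is the kind of guarantee that caching and set operations on predicates would rely on.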