[Types-sig] Proposal: "abstract" type checking

Robin Thomas robin.thomas@starmedia.net
Sat, 17 Mar 2001 14:45:17 -0500


I'm not knowledgeable enough to weigh in on static typing, the type/class 
dichotomy in CPython, etc. But I would like to propose the following change 
to (C)Python. If this proposal is redundant, please tell me. FWIW, I did 
look through the existing work before posting it.

"ABSTRACT" TYPE CHECKING

When you do this in Python:

 >>> type(obj)

the built-in function type() returns a type object, which can be compared 
with other type objects (CPython compares type objects by object identity). 
As of 1.5.2 (?), the built-in function isinstance() accepts, as a 
convenience, a type object as its second argument, such that:

 >>> import types
 >>> isinstance(2, types.IntType)
1

I often find myself doing the following in Python. This is a synthetic 
example, not production code:

from types import ListType, TupleType, DictType

def myfunc(x):
    t = type(x)
    if t in (ListType, TupleType):
        do_this_for_a_list_or_tuple(x)
    elif isinstance(x, DictType):
        do_this_for_a_dict(x)
    else:
        # report the offending argument from myfunc()'s own frame
        raise TypeError, \
            "x must be list, tuple, or dict, you passed: %s" % repr(x)

When I do this, I often don't care whether x really is a concrete list, 
tuple, or dictionary object. I only care that it supports some or all of 
the sequence or mapping protocols.

The above works well, until I find myself wanting to define a 
dictionary-like class that will print debugging info, generate events, or 
something else whenever certain operations are performed on its instances:

# abridged example
import UserDict

class MyDict(UserDict.UserDict):
     def __setitem__(self, item, value):
         do_something_first(item, value)
         UserDict.UserDict.__setitem__(self, item, value)
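
To make the problem concrete, here is a minimal sketch. It assumes a 
hand-written stand-in for MyDict (LoggingDict, a made-up name, used so the 
snippet needs no imports) and a function classify() (also made up) that 
applies the same exact-type tests as myfunc() but returns a label instead 
of calling helper functions. The raise statement uses the call form so the 
snippet is not tied to one interpreter version.

```python
# Hypothetical stand-in for MyDict: a dictionary-like class written by
# hand. All names in this snippet are illustrative assumptions.
class LoggingDict:
    def __init__(self):
        self.data = {}
        self.log = []

    def __setitem__(self, key, value):
        self.log.append((key, value))   # the "do something first" step
        self.data[key] = value

    def __getitem__(self, key):
        return self.data[key]

    def keys(self):
        return list(self.data.keys())


def classify(x):
    # same exact-type tests as myfunc(), returning a label instead of
    # dispatching to helper functions
    if type(x) in (type([]), type(())):
        return "sequence"
    elif isinstance(x, type({})):
        return "mapping"
    raise TypeError("x must be list, tuple, or dict, you passed: %s" % repr(x))


d = LoggingDict()
d["spam"] = 1            # behaves like a dictionary...

try:
    label = classify(d)
except TypeError:
    label = "rejected"   # ...but fails the concrete type check
```

The instance supports item assignment and lookup, yet the concrete type 
tests have no way to learn that.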


Now I cannot pass instances of MyDict to myfunc() without it raising a 
TypeError. At this point, I consider the following solutions:

1) Don't do run-time type checking in myfunc(). Just let the code blow up 
when it tries to operate on the object in a way it doesn't support.

This would be acceptable if myfunc merely did type-checking as a "sanity 
check" on arguments. However, myfunc() also makes decisions based on the 
type of the argument.

I also am distributing myfunc() to less experienced Python users. I don't 
want them to see

AttributeError: has_key

or something similar as their error, especially from a frame way down in 
the stack from myfunc(). I want them to see

TypeError: x must be a list, tuple, or dictionary, you passed: 42

from the frame associated with myfunc(). This kind of "educational" error 
reporting is very important to my application and to our development efforts.

2) Don't check argument types; use hasattr() on the argument object to see 
if it supports the right protocols/operations.

Nice try, but some types (like tuples) don't have methods or other 
meaningful attributes. Some of the built-in functions help, like len(), but 
in myfunc() I couldn't decide whether to use x as a list/tuple or a dict 
because all three object types have length.
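
A small sketch of why len() cannot make that decision. The helper 
has_length() is a made-up name; it only reports whether len() accepts the 
object, and it gives the same answer for a list, a tuple, and a dictionary.

```python
def has_length(x):
    # len() raises TypeError for objects that have no length
    try:
        len(x)
    except TypeError:
        return 0
    return 1

samples = [[1, 2], (1, 2), {"a": 1}, 42]
answers = [has_length(s) for s in samples]
```

Only the integer is told apart from the rest, so a length test cannot pick 
between the sequence branch and the mapping branch of myfunc().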

3) Replace the built-in functions type() and isinstance() to allow an 
object to intercept a type check and claim to be of a type different than 
its concrete type.

This works very, very well. I don't have to change any of my existing code, 
unless some of my Python code really demands that an object be of a certain 
concrete type. (I haven't found this case in real life yet, and if I did, 
the worst that could happen is that Python would generate a traceback, 
which is just fine and better than what happens with the previous 
alternatives.) Working code is below.

# abstract_type.py

import __builtin__, types

# Single leading underscore: names that begin with two underscores are
# mangled inside a class body, so TypeDelegate could not reach them.
_type_concrete = __builtin__.type
_isinstance_concrete = __builtin__.isinstance

# this should probably be implemented as a builtin type,
# and its concrete type listed in the types module
# (but not __builtin__).

class TypeDelegate:
    def __init__(self, realtype, dispatch):
        if _type_concrete(realtype) != types.TypeType:
            raise TypeError, "realtype must be type object"
        if _type_concrete(dispatch) not in (types.MethodType,
                                            types.FunctionType):
            raise TypeError, "dispatch must be method or function"
        self.realtype = realtype
        self.dispatch = dispatch

    def __cmp__(self, other):
        # __cmp__ must return 0 for "equal" and nonzero otherwise
        if self.realtype == other or self.dispatch(other):
            return 0
        return 1

    __rcmp__ = __cmp__

def type(obj, concrete=0):
    t = _type_concrete(obj)
    if concrete: return t
    # only concrete instances may implement
    # a special abstract type-checking method;
    # the exception approach can be replaced
    # but is very concise
    elif t == types.InstanceType:
        try:
            # look the hook up on the instance, not on its concrete type
            return TypeDelegate(types.InstanceType, obj.__type__)
        except AttributeError:
            pass
    return t

def isinstance(obj, classobj, concrete=0):
    if concrete: return _isinstance_concrete(obj, classobj)
    # abstract type check of classobj important;
    # user-defined "type objects" need to be supported,
    # and the concrete isinstance() would reject them
    if type(classobj) == types.TypeType:
        return type(obj) == classobj
    return _isinstance_concrete(obj, classobj)

__builtin__.type = type
__builtin__.isinstance = isinstance

# end of abstract_type.py
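
The same idea can be tried without touching the builtins. The sketch below 
is an illustration under stated assumptions, not the proposed 
implementation: Delegate, abstract_type() and LoggingDict are made-up 
names, and the delegate uses __eq__/__ne__ rather than __cmp__, which 
assumes an interpreter with rich comparisons.

```python
# Hypothetical, self-contained sketch; it does not replace any builtin.
class Delegate:
    """Compares equal to realtype, or to anything dispatch() accepts."""
    def __init__(self, realtype, dispatch):
        self.realtype = realtype
        self.dispatch = dispatch

    def __eq__(self, other):
        if self.realtype is other or self.dispatch(other):
            return 1
        return 0

    def __ne__(self, other):
        return not self.__eq__(other)


def abstract_type(obj, concrete=0):
    t = type(obj)
    if concrete:
        return t
    # objects opt in by defining a __type__ method
    hook = getattr(obj, "__type__", None)
    if hook is None:
        return t
    return Delegate(t, hook)


class LoggingDict:
    def __init__(self):
        self.data = {}

    def __setitem__(self, key, value):
        self.data[key] = value

    def __getitem__(self, key):
        return self.data[key]

    def __type__(self, other):
        # claim to be a dictionary for abstract type checks
        return other is type({})


d = LoggingDict()
is_dict = (abstract_type(d) == type({}))   # answered by d.__type__
is_list = (abstract_type(d) == type([]))   # d makes no such claim
```

Objects without a __type__ method get their ordinary type object back, so 
existing comparisons on built-in objects behave as before.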

NOTES AND PROPAGANDA

- Instances which emulate types will pass run-time type checking tests done 
with type() and the type-checking behavior of isinstance().

- You can create your own "type objects" to represent certain sets of 
functionality: subsets of object protocols (e.g. SliceableType, 
ReadOnlyDictType):

from types import *

# maybe this too could be a builtin type,
# but it doesn't have to be
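class Type:
    def __init__(self, types=None):
        # tuple of concrete types this object stands for, or None
        if types: self.types = types
        else: self.types = None

    def __type__(self, type):
        return type == TypeType

    def __cmp__(self, other):
        # __cmp__ must return 0 for "equal" and nonzero otherwise
        t = self.types
        if t: matched = other in t
        else: matched = self is other
        if matched: return 0
        return 1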

NonStringlikeSequenceType = Type( (TupleType, ListType) )
SliceableType = Type()
CanCallMethodFooType = Type()
PersistableType = Type()

- No code breakage: type(obj) still works. Anybody who really needs to know 
the *concrete* type of an object from Python can do type(obj, 1). Whether 
checking concrete type *in Python code* is a common case or (as I suspect) 
a very rare case is up for discussion.

- Concrete type-checking at the C layer is completely unaffected.

- Performance cost seems negligible. Your findings and opinions on this 
issue are welcome.

- A logical extension of this proposal is to add a type-checking function 
at the abstract object layer, either 
PyObject *PyObject_GetType(PyObject *o) or 
int PyObject_CheckType(PyObject *o, PyObject *t). Abstract type comparison 
is really just an object comparison with a special, conventional meaning.

- This proposal makes no strident claims regarding the Right Way to do 
"interfaces", resolve the class/type dichotomy, emulate types, or anything 
else. It doesn't just fail to address the class/type dichotomy; it actually 
respects it. It merely provides a standard hook for emulating object typing 
in an abstract sense, just as all of the special __methods__ allow concrete 
instance objects to assert emulation of numbers, sequences, and various 
object operations in an abstract way.

- Individuals may explore different typing strategies without modifying or 
extending Python itself, and can put that code in production environments. 
Then they can *distribute* that code to others, and the new built-in 
abstract typing provides a standard way to conform to type-checking 
conventions. If you disagree with others about how typing can/should be used

- Optionally, the old type and isinstance functions may be preserved in the 
builtin module under different, "preserved" names, a la __stdout__ for 
stdout in the sys module. (Allows brute-force backward compatibility in the 
rare case that your Python code is full of concrete type checks.)

- New or changed code is very minimal. The full implementation adds two 
functions to bltinmodule.c (the replacements for type and isinstance), and 
one (or two) functions to Objects/abstract.c (for the abstract type 
checking functions for the C API).

- Another way to bypass the "abstract typing hook" is to use the "is" 
operator for type object comparison. "type(obj) is types.DictType" 
effectively thwarts any attempts at abstract typing. One sticky example is 
"type(obj) is types.InstanceType" when obj is an instance with a __type__ 
method. Ack! Your feedback on this issue is greatly appreciated!

- It seems like the idiom "type(obj) == some_type" is deprecated. That's 
why isinstance supports type checking, and why the abstract object layer 
has PySequence_Check et al. I get the impression that type checking is 
encouraged as an operation on two objects, object and typeobject. If that's 
true, the above changes to type() and the TypeDelegate behavior are for 
backward compatibility with the idiom "type(obj) == some_type".
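
The notes on user-defined "type objects" and on the "is" operator can be 
illustrated together. This is a hedged sketch, not part of the proposal: 
TypeGroup is a made-up name, and it relies on __eq__/__ne__ (an 
interpreter with rich comparisons is assumed) instead of __cmp__.

```python
class TypeGroup:
    """A user-defined "type object" standing for several concrete types."""
    def __init__(self, members):
        self.members = tuple(members)

    def __eq__(self, other):
        # identity test against each member type
        for m in self.members:
            if m is other:
                return 1
        return 0

    def __ne__(self, other):
        return not self.__eq__(other)


NonStringlikeSequence = TypeGroup((type(()), type([])))

obj = [1, 2, 3]
by_equality = (type(obj) == NonStringlikeSequence)   # goes through __eq__
by_identity = (type(obj) is NonStringlikeSequence)   # never consults the hook
```

The equality test succeeds because the comparison reaches the group 
object's hook; the identity test fails because "is" compares the two 
objects directly and no hook can intercept it.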


--
Robin Thomas
Engineering
StarMedia Network, Inc.
robin.thomas@starmedia.net