[Types-sig] Proposal: "abstract" type checking
Robin Thomas
robin.thomas@starmedia.net
Sat, 17 Mar 2001 14:45:17 -0500
I'm not knowledgeable enough to weigh in on static typing, the type/class
dichotomy in CPython, etc. But I would like to propose the following change
to (C)Python. If this proposal is redundant, please tell me. FWIW, I did
research existing work before posting this proposal.
"ABSTRACT" TYPE CHECKING
When you do this in Python:
>>> type(obj)
the built-in function type() returns a type object, which can be compared
with other type objects (CPython compares type objects by object identity).
As of 1.5.2 (?), the built-in function isinstance() accepts, as a
convenience, a type object as its second argument, such that:
>>> import types
>>> isinstance(2, types.IntType)
1
I often find myself doing the following in Python. This is a synthetic
example, not production code:
from types import ListType, TupleType, DictType

def myfunc(x):
    t = type(x)
    if t in (ListType, TupleType):
        do_this_for_a_list_or_tuple(x)
    elif isinstance(x, DictType):
        do_this_for_a_dict(x)
    else:
        raise TypeError, "x must be list, tuple, or dict, you passed: %s" % repr(x)
When I do this, I often don't care whether x really is a concrete list,
tuple, or dictionary object. I only care that it supports some or all of
the sequence or mapping protocols.
The above works well, until I find myself wanting to define a
dictionary-like class that will print debugging info, generate events, or
something else whenever certain operations are performed on its instances:
# abridged example
import UserDict

class MyDict(UserDict.UserDict):
    def __setitem__(self, item, value):
        do_something_first(item, value)
        UserDict.UserDict.__setitem__(self, item, value)
Now I cannot pass instances of MyDict to myfunc() without it raising a
TypeError. At this point, I consider the following solutions:
1) Don't do run-time type checking in myfunc(). Just let the code blow up
when it tries to operate on the object in a way it doesn't support.
This would be acceptable if myfunc merely did type-checking as a "sanity
check" on arguments. However, myfunc() also makes decisions based on the
type of the argument.
I am also distributing myfunc() to less experienced Python users. I don't
want them to see
AttributeError: has_key
or something similar as their error, especially from a frame way down in
the stack from myfunc(). I want them to see
TypeError: x must be a list, tuple, or dictionary, you passed: 42
from the frame associated with myfunc(). This kind of "educational" error
reporting is very important to my application and to our development efforts.
2) Don't check argument types; use hasattr() on the argument object to see
if it supports the right protocols/operations.
Nice try, but some types (like tuples) don't have methods or other
meaningful attributes. Some of the built-in functions help, like len(), but
in myfunc() I couldn't decide whether to use x as a list/tuple or a dict
because all three object types have length.
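To make the problem concrete, here is the kind of dead end I hit
(illustrative session only):

>>> hasattr((1, 2, 3), 'has_key')
0
>>> hasattr((1, 2, 3), 'append')
0
>>> len([1, 2]), len((1, 2)), len({1: 'a', 2: 'b'})
(2, 2, 2)

Tuples expose no attributes to probe, and len() gives the same answer for
all three types, so neither test tells me which protocol to use.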
3) Replace the built-in functions type() and isinstance() to allow an
object to intercept a type check and claim to be of a type different than
its concrete type.
This works very, very well. I don't have to change any of my existing code,
unless some of my Python code really demands that an object be of a certain
concrete type. (I haven't found this case in real life yet, and if I did,
the worst that could happen is that Python would generate a traceback,
which is just fine and better than what happens with the previous
alternatives.) Working code is below.
# abstract_type.py
import __builtin__, types

# single leading underscore on purpose: double-underscore names would be
# mangled inside the class body below
_type_concrete = __builtin__.type
_isinstance_concrete = __builtin__.isinstance

# this should probably be implemented as a builtin type,
# and its concrete type listed in the types module
# (but not __builtin__).
class TypeDelegate:
    def __init__(self, realtype, dispatch):
        if _type_concrete(realtype) != types.TypeType:
            raise TypeError, "realtype must be type object"
        if _type_concrete(dispatch) not in (types.MethodType,
                                            types.FunctionType):
            raise TypeError, "dispatch must be method or function"
        self.realtype = realtype
        self.dispatch = dispatch
    def __cmp__(self, other):
        # return 0 (equal) when we match either the real type object
        # or whatever the instance's dispatch method claims to be
        if self.realtype == other or self.dispatch(other):
            return 0
        return 1
    __rcmp__ = __cmp__

def type(obj, concrete=0):
    t = _type_concrete(obj)
    if concrete: return t
    # only concrete instances may implement
    # a special abstract type-checking method;
    # the exception approach can be replaced
    # but is very concise
    elif t == types.InstanceType:
        try:
            return TypeDelegate(types.InstanceType, obj.__type__)
        except:
            pass
    return t

def isinstance(obj, classobj, concrete=0):
    result = _isinstance_concrete(obj, classobj)
    if concrete: return result
    # abstract type check of classobj important;
    # user-defined "type objects" need to be supported
    if type(classobj) == types.TypeType:
        return type(obj) == classobj
    return result

__builtin__.type = type
__builtin__.isinstance = isinstance
# end of abstract_type.py
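To show what the hook buys you, here is a minimal usage sketch (illustrative
only; it assumes the module above is saved as abstract_type.py and gives the
earlier MyDict a hypothetical __type__ method):

import abstract_type   # replaces type() and isinstance() in __builtin__
import types, UserDict

class MyDict(UserDict.UserDict):
    # claim to be a dictionary for abstract type checks
    def __type__(self, t):
        return t == types.DictType

d = MyDict()
print type(d) == types.DictType         # 1: the abstract check sees a "dict"
print isinstance(d, types.DictType)     # 1
print type(d, 1) == types.DictType      # 0: concrete check still available
print type(d, 1) == types.InstanceType  # 1

With this in place, myfunc() from the first example accepts MyDict instances
without any changes to myfunc() itself.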
NOTES AND PROPAGANDA
- Instances which emulate types will pass run-time type checks done with
type() and with the type-checking form of isinstance().
- You can create your own "type objects" to represent certain sets of
functionality, such as subsets of object protocols (e.g. SliceableType,
ReadOnlyDictType). For example:
from types import *

# maybe this too could be a builtin type,
# but it doesn't have to be
class Type:
    def __init__(self, types=None):
        if types: self.types = types
        else: self.types = None
    def __type__(self, type):
        # claim to be a type object for abstract type checks
        return type == TypeType
    def __cmp__(self, other):
        # return 0 (equal) for any type in our set,
        # or for ourselves if no set was given
        t = self.types
        if t:
            if other in t: return 0
        elif self is other:
            return 0
        return 1

NonStringlikeSequenceType = Type( (TupleType, ListType) )
SliceableType = Type()
CanCallMethodFooType = Type()
PersistableType = Type()
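Here is a small sketch of how such a user-defined type object might be used
(illustrative only; the Type instance is kept on the left of == so that its
__cmp__ method does the comparison):

# illustrative sketch: dispatching on a user-defined "type object"
def process_sequence(x):
    if NonStringlikeSequenceType == type(x):
        return list(x)   # tuples and lists handled alike
    raise TypeError, "x must be a non-string sequence, you passed: %s" % repr(x)

process_sequence([1, 2, 3])   # ok
process_sequence((4, 5, 6))   # ok
process_sequence("abc")       # raises TypeError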
- No code breakage: type(obj) still works. Anybody who really needs to know
the *concrete* type of an object from Python can do type(obj, 1). Whether
checking concrete type *in Python code* is a common case or (as I suspect)
a very rare case is up for discussion.
- Concrete type-checking at the C layer is completely unaffected.
- Performance cost seems negligible. Your findings and opinions on this
issue are welcome.
- A logical extension of this proposal is to add a type-checking function
at the abstract object layer, either PyObject *PyObject_GetType(PyObject
*o) or int PyObject_CheckType(PyObject *o, PyObject *t). Abstract type
comparison is really just an object comparison with a special, conventional
meaning.
- This proposal makes no strident claims regarding the Right Way to do
"interfaces", resolve the class/type dichotomy, emulate types, or anything
else. It doesn't just fail to address the class/type dichotomy; it actually
respects it. It merely provides a standard hook for emulating object typing
in an abstract sense, just as all of the special __methods__ allow concrete
instance objects to assert emulation of numbers, sequences, and various
object operations in an abstract way.
- Individuals may explore different typing strategies without modifying or
extending Python itself, and can put that code in production environments.
Then they can *distribute* that code to others, and the new built-in
abstract typing provides a standard way to conform to type-checking
conventions. If you disagree with others about how typing can/should be
used, the shared built-in hook still gives your code and theirs a common
way to interoperate.
- Optionally, the old type and isinstance functions may be preserved in the
builtin module under different, "preserved" names, a la __stdout__ for
stdout in the sys module. (Allows brute-force backward compatibility in the
rare case that your Python code is full of concrete type checks.)
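A sketch of one possible spelling (the preserved names here are only a
suggestion), appended to the end of abstract_type.py:

# sketch only: keep the concrete versions reachable under "preserved"
# names, a la sys.__stdout__
__builtin__.concrete_type = _type_concrete
__builtin__.concrete_isinstance = _isinstance_concrete

# code that truly needs a concrete check could then write:
#     if concrete_type(x) is types.DictType: ...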
- New or changed code is very minimal. The full implementation adds two
functions to bltinmodule.c (the replacements for type and isinstance), and
one (or two) functions to Objects/abstract.c (for the abstract type
checking functions for the C API).
- Another way to bypass the "abstract typing hook" is to use the "is"
operator for type object comparison. "type(obj) is types.DictType"
effectively thwarts any attempts at abstract typing. One sticky example is
"type(obj) is types.InstanceType" when obj is an instance with a __type__
method. Ack! Your feedback on this issue is greatly appreciated!
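A quick illustration of that sticky point (hypothetical session, using a
MyDict that defines __type__ as in the usage sketch above):

>>> d = MyDict()
>>> type(d) == types.DictType
1
>>> type(d) is types.DictType
0
>>> type(d) is types.InstanceType
0

The last result is 0 because type(d) returns a TypeDelegate rather than the
InstanceType object itself, which is exactly the case that worries me.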
- It seems like the idiom "type(obj) == some_type" is deprecated. That's
why isinstance() accepts type objects, and why the abstract object layer has
PySequence_Check et al. I get the impression that type checking is
encouraged as an operation on two objects, object and typeobject. If that's
true, the above changes to type() and the TypeDelegate behavior are for
backward compatibility with the idiom "type(obj) == some_type".
--
Robin Thomas
Engineering
StarMedia Network, Inc.
robin.thomas@starmedia.net