[Python-3000] An introduction to ABC's

Tue Apr 17 09:18:05 CEST 2007

Aurélien Campéas wrote:
 > On Sat, Apr 14, 2007 at 10:57:28AM -0700, Talin wrote:
 >> Part of the reason why I haven't volunteered to write a PEP for 
ABC's is because I don't feel that I understand the various proposals 
and the background discussion well enough. However, it occurs to me that 
writing the rationale section of a PEP can often be the hardest part, 
and I think I understand the issues well enough to write that preface. 
So here's my contribution:
 >>
 >> ---
 >>
 >> In the domain of object-oriented programming, the usage patterns for 
interacting with an object can be divided into two basic categories, 
which are 'invocation' and 'inspection'.
 >
 > I'd put 'definition' there. Especially in a language like Python
 > (dynamic isn't it ?), where additional definitions can be introduced
 > at run-time, or existing definitions be substituted for existing ones
 > (aka 'monkey-patching'). At least where I work, this is an important
 > category. Java/C++ monkeys can't really understand the importance of
 > this, though.
 >
 >> Invocation means interacting with an object by invoking its methods. 
Usually this is combined with polymorphism, so that invoking a given 
method may run different code depending on the type of an object.
 >>
 >> Inspection means the ability for external code (outside of the 
object's methods) to examine the type or properties of that object, and 
make decisions on how to treat that object based on that information.
 >>
 >> Both usage patterns serve the same general end, which is to be able 
to support the processing of diverse and potentially novel objects in a 
uniform way, but at the same time allowing processing decisions to be 
customized for each different type of object.
 >>
 >> In classical OOP theory, invocation is the preferred usage pattern,
 >> and
 >
 > uh, well, but what is classical OOP theory ? What is defined by
 > SmallTalk, C++ and Java ? Feh !

Since my knowledge of programming is almost entirely self-taught, I 
don't know much about which formalisms are being taught to young 
programmers today. However, based on my interactions with programmers 
who have gone through a formal CS education, I can infer the existence 
of a "school of thought" of object decomposition and software 
architecture that follows rules very similar to what I have described. 
What I lack, however, are bibliographic references for it.

 >> inspection is actively discouraged, being considered a relic of an 
earlier, procedural programming style. However, in practice this view is 
simply too dogmatic and inflexible, and leads to a kind of design 
rigidity that is very much at odds with the dynamic nature of a language 
like Python.
 >
 > imho it's not a too dogmatic & inflexible view (it is quite correct
 > that inspection-based dispatch, aka duck-typing (I call it
 > {L,S,F}uck-typing sometimes, in anger ...), is an evil hack that
 > should be avoided in production software, for obvious maintenance
 > problems) ; but it is very useful in the context of rapid, interactive
 > development, when one has to explore the shape of things to come ...

I think it's important to understand that 'duck-typing' is not the same 
as inspection-based dispatch - it is an entirely orthogonal concept. 
'Duck-typing' simply means that two classes can be polymorphic without 
having a common base class or interface - all they have to do is support 
the same methods. If I want to create a new kind of widget, I don't have 
to inherit from the base Widget class, all I have to do is make sure 
that my custom widget class supports all of the proper Widget methods. 
This is not possible with ordinary class-based polymorphism in Java or C++.

Duck typing can apply to both invocation ('use this object as a widget') 
and inspection ('is this object a widget?'). That is why I claim it is 
orthogonal to the issue of inspection.
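
As a minimal sketch of that distinction (the class and function names 
here are hypothetical, chosen only for illustration), two unrelated 
classes can be used polymorphically, and the same "does it support the 
methods?" idea can drive either invocation or inspection:

```python
# Hypothetical widgets: neither class inherits from a shared Widget base.
class Button:
    def draw(self):
        return "drawing a button"


class Slider:
    def draw(self):
        return "drawing a slider"


def render(widget):
    # Invocation: simply call the method. Any object with draw() works.
    return widget.draw()


def looks_like_widget(obj):
    # Inspection, duck-typed: ask whether the object supports the
    # protocol, not whether it descends from a particular class.
    return callable(getattr(obj, "draw", None))


results = [render(w) for w in (Button(), Slider())]
```

Here render() is the invocation case and looks_like_widget() is the 
inspection case; neither one depends on a common base class.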

And I don't think that inspection-based dispatch is an evil hack, but it 
certainly can be misused. Inspection-based dispatch can be something as 
simple as "if this is a list, do A, else do B". The evil comes when you 
use it in a way that lacks discipline - what you might call 'spaghetti 
inspection'. But it's hard to use proper discipline, because we haven't 
yet defined what the proper discipline *is* - we haven't set the 
standard. That is what the ABCs stuff is attempting to establish - to 
make it possible to use inspection in a way that is 'not evil' because 
it conforms to a well-understood pattern.

 >> In particular, there is often a need to process objects in a way 
that wasn't anticipated by the creator of the object class. It is not 
always the best solution to build in to every object methods that satisfy
 >> the
 >
 > it is just not possible
 >
 >> needs of every possible user of that object. Moreover, there are 
many powerful dispatch philosophies that are in direct contrast to the 
classic OOP requirement of behavior being strictly encapsulated
 >> within
 >
 > 'classic OOP requirement' is clearly the dogma there :)
 > Python developers should never feel obliged towards it ; especially
 > since Python owes (almost) nothing to this dogma.
 >
 >> an object, examples being rule or pattern-match driven logic.
 >>
 >> On the other hand, one of the criticisms of inspection by 
classic OOP theorists is the lack of formalisms and the ad hoc nature of 
what is being inspected. In a language such as Python, in which almost any
 >
 > The school of static verification of simple program properties will
 > never cease to come after us, programmers, to try to sell its highly
 > impractical theoretical belief about ways to ensure program
 > correctness ; let's not fall into the traps of their evil ways^H^H^H
 > hidden academic agenda ...
 >
 >> aspect of an object can be reflected and directly accessed by 
external code, there are many different ways to test whether an object 
conforms to a particular protocol or not. For example, if asking 'is 
this object a mutable sequence container?', one can look for a base 
class of 'list', or one can look for a method named '__getitem__'. But 
note that although these tests may seem obvious, neither of them is 
correct, as the first generates false negatives and the second false positives.
 >>
 >> The generally agreed-upon remedy is to standardize the tests, and 
group them into a formal arrangement. This is most easily done by 
associating with each class a set of standard testable properties, 
either via the inheritance mechanism or some other means. Each test 
carries with it a set of promises: it contains a promise about the 
general behavior of the class, and a promise as to what other class 
methods will be available.
 >>
 >> This PEP proposes a particular strategy for organizing these tests 
known as Abstract Base Classes, or ABC. ABCs are simply Python classes 
that are added into an object's inheritance tree to signal certain 
features of that object to an external inspector. Tests are done using 
isinstance(), and the presence of a particular ABC means that the test 
has passed.
 >>
 >> Like all other things in Python, these promises are in the nature of 
a gentlemen's agreement - which means that the language does not attempt 
to enforce that these promises are kept.
 >
 > bleh
 >
 > What's wrong with associating behaviour to methods ? Currently, the
 > wrongness lies in the strong coupling between class definition and
 > method definition. Hence you (also in the name of others) propose an
 > ABC, for instance, in order to assess that an object conforms to a
 > protocol. But suppose you work with a class whose definition you don't
 > control (because it's a piece of Plone/Archetype for instance) and you
 > want to make it indexable. With ABC, you'd need to basically add a
 > super class to add behaviour, at runtime, monkey-patching things up
 > till you make it work, most of the time. This is a terrible thing to
 > do, even in Python. Or maybe I miss some wonderful well-known recipe
 > about it ...
 >
 > Let's use generic functions instead of ABC's, for god's sake !
 > Need to define a common protocol for item access in an object :

There's nothing in what I said that prevents the use of generic 
functions. Part of the reason for defining the ABC's is to give the 
generic functions something to operate on.

If you look at the research material that's been done on generic 
functions, you'll notice that there's a generalization of generic 
functions called "predicate-based dispatch". That's simply a fancy way 
of saying that we do some tests on the calling arguments, and depending 
on how those tests come out we decide which actual piece of code to run.

A subset of predicate-based dispatch is called "type-based dispatch" -- 
what most people think of as generic functions. Type based dispatch is a 
specialized form of predicate-based dispatch where all of the tests are 
type tests, i.e. "if the first argument is of type A, then use this code..."
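
For illustration only, here is what type-based dispatch can look like 
using functools.singledispatch, a standard-library decorator that was 
added to Python well after this thread. It is not the mechanism being 
proposed here, and it dispatches on the type of the first argument only. 
The function name describe() is made up for the example:

```python
from functools import singledispatch


@singledispatch
def describe(arg):
    # Fallback used when no registered type matches.
    return "something else"


@describe.register(list)
def _(arg):
    # "If the first argument is of type list, then use this code..."
    return "a list of %d items" % len(arg)


@describe.register(str)
def _(arg):
    return "a string"
```

Calling describe([1, 2, 3]) selects the list implementation, while 
describe(3.5) falls through to the default.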

The point is, regardless of whether you are using the full 
predicate-based dispatch or the more specialized type-based dispatch, 
the generic function dispatch is nothing more than a series of boolean 
tests. In many cases, we can optimize the tests by creating lookup 
tables and such, compressing multiple tests into a single jump 
calculation so that they may not "look like" sequential boolean tests, 
but nevertheless that's what they are.
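
To make the "series of boolean tests" concrete, here is a deliberately 
naive sketch of a predicate-dispatched generic function. Everything in 
it (PredicateFunction, when, size) is a hypothetical name invented for 
this example, and a real implementation would optimize the linear scan 
rather than run it on every call:

```python
class PredicateFunction:
    """Toy generic function: an ordered list of (predicate, implementation)."""

    def __init__(self, default):
        self._rules = []
        self._default = default

    def when(self, predicate):
        # Decorator that registers an implementation guarded by a predicate.
        def decorator(func):
            self._rules.append((predicate, func))
            return func
        return decorator

    def __call__(self, *args):
        # Dispatch is literally a sequence of boolean tests, first match wins.
        for predicate, func in self._rules:
            if predicate(*args):
                return func(*args)
        return self._default(*args)


size = PredicateFunction(default=lambda obj: "unknown")


@size.when(lambda obj: isinstance(obj, list))   # an "is-a" test
def _size_list(obj):
    return "list of %d" % len(obj)


@size.when(lambda obj: hasattr(obj, "__len__"))  # a "has-a" test
def _size_sized(obj):
    return "sized, length %d" % len(obj)
```

Type-based dispatch is the special case where every predicate is an 
isinstance() check, which is what makes table-driven optimization possible.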

The controversy over ABCs is really a controversy over what kinds of 
tests we should use for dispatching. Should the tests be limited to 
"is-a" (isinstance) tests, or should we also allow "has-a" (hasattr) 
tests as well?

But generic functions can easily use either kind of test, or any other 
kind of boolean test for that matter. So nothing about ABCs is 
incompatible with the use of generic functions.
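
The difference between the two kinds of test can be seen on the 
"is this a mutable sequence?" question from the rationale. The sketch 
below uses a hypothetical Ring class; the two helper functions are 
likewise invented for the example:

```python
class Ring:
    """Hypothetical mutable sequence that does not inherit from list."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index % len(self._items)]

    def __setitem__(self, index, value):
        self._items[index % len(self._items)] = value

    def __len__(self):
        return len(self._items)


def is_a_test(obj):
    # "is-a": look for a base class of list.
    return isinstance(obj, list)


def has_a_test(obj):
    # "has-a": look for a method named __getitem__.
    return hasattr(obj, "__getitem__")


ring = Ring([1, 2, 3])
false_negative = not is_a_test(ring)      # Ring is rejected by the is-a test
false_positive = has_a_test({"a": 1})     # a dict is accepted by the has-a test
```

A generic function could use either test as its dispatch predicate; the 
open question is which of them deserves to be the standard one.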

Now, you bring up the point that in some cases, you might want to use 
tests in a way that allows you to override the classification of an 
object. In other words, the object itself may have some "native" 
classification (e.g. 'this is a list, this is a string') that is 
inherent in its construction, but instead you might want to 
impose a different set of categories (e.g. 'this is a 
marshall-by-reference, this is a marshall-by-value'), so that objects 
which are unrelated get grouped together, and vice versa.

It seems to me that this is a good argument to have - it's really the 
crux of much of the previous discussion - but it will need more concrete 
examples and use cases.

You see, even if this "override" feature isn't supported in the ABC 
system, there's nothing to prevent someone from adding it to the generic 
dispatch mechanism. Generic functions need not be limited to only using 
'isinstance' tests and need not be limited to only using ABCs. An 
independent taxonomy of types and categories can be built using an 
entirely parallel data structure, without touching the original classes 
at all, and dispatching can be based on that.
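
A rough sketch of such a parallel taxonomy, assuming a plain dictionary 
as the registry (the names classify, category_of and marshal, and the 
category labels, are all hypothetical and exist only for this example):

```python
# Category labels are assumptions made for this example.
BY_REFERENCE = "by-reference"
BY_VALUE = "by-value"

# The taxonomy lives in its own table; the classes themselves are untouched.
_categories = {}


def classify(cls, category):
    _categories[cls] = category


def category_of(obj):
    # Walk the MRO so that subclasses pick up their parent's category.
    for cls in type(obj).__mro__:
        if cls in _categories:
            return _categories[cls]
    return None


classify(list, BY_REFERENCE)
classify(str, BY_VALUE)
classify(tuple, BY_VALUE)


def marshal(obj):
    # Dispatch on the externally assigned category, not on the "native" type.
    category = category_of(obj)
    if category == BY_REFERENCE:
        return ("ref", id(obj))
    if category == BY_VALUE:
        return ("val", obj)
    raise TypeError("no category registered for %s" % type(obj).__name__)
```

Here str and tuple, which are otherwise unrelated, end up grouped 
together, and nothing was added to any class's inheritance tree.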

The real question to ask is - what is the 90% solution? In other words, 
what is the simplest mechanism that covers 90% of the use cases, while 
allowing the other 10% of use cases to be handled by allowing the 
application programmer to extend the system or add a layer on top of it?

Here's how the debate should go: Various proponents will submit designs 
for an object inspection/classification system that cover those 90% of 
cases. Ideally, the simplest design will be used as the starting point. 
Various people will then put forward use cases that aren't included by 
the candidate design. However, merely putting forward a use case isn't 
enough, one has to make a convincing argument that such a use case is 
significant - meaning that it is likely to be commonly occurring, and 
has no easy workaround.

Thus, if someone has a use case that requires that objects be 
dynamically reclassified at runtime, they should be prepared to show 
that (a) it's a common use case, and (b) there's no obvious way to do it 
other than to build that feature into the system from the beginning.

 > use a 'getitem' generic function ...
 >
 > Need to implement it for classes you didn't write (or even for your
 > own) :
 > @method(FooWidget, int)
 > def getitem(widget, pos):
 >     # just do it in a FooWidget-y- way
 >
 > Well, I just don't dare post this on the py3k list ... I also just
 > don't get why people are so reluctant to understand what generic
 > functions are, and how they would solve all these non-problems :)
 >
 > Of course, for py3k there should be some bridge between old-style
 > generic functions (__len__, __getitem__, ...) and something more
 > general ; something that, above all, DECOUPLES method definition from
 > class definition. We already know that Python is not about
 > encapsulation (in the classical dogmatic way), but merely about putting
 > stuff into namespaces. So let's drop entirely the pseudo encapsulation
 > thing, and let's go with the one truly modular way. Generic functions
 > are namespaces, of course, and can be further put into any modules or
 > namespace kind you want to.
 >
 > Cheers,
 > Aurélien.
 >
 > PS : note how such a getitem gf could generalize also to
 > hashtable/dict-like access, with something like :
 >
 > @method(dict, str)
 > def getitem(adict, astr):
 >     # just do it ...
 >
 > Double dispatch is not just a fancy complicated feature. It allows one to
 > build much more legible, maintainable code. Ok that might be a little
 > harder for the folks that maintain the language runtime. But these are
 > supposed to be the real men, not the day-to-day users that currently
 > struggle with a broken 'object model'.
 >
 > Which is broken because inspired by the aforementioned, mindless,
 > dogmas. Too many people in Python-land have been exposed to Java, alas
 > ...
 >
 > hum, I stop now.

By the way, there's something odd about the "reply-to" headers on your 
email: when I hit "reply all", it replied only to you, not the list.

-- Talin