[Python-3000] Adaptation [was:Re: Iterators for dict keys, values, and items == annoying :)]
Alex Martelli
aleaxit at gmail.com
Sat Apr 1 20:50:51 CEST 2006
On Apr 1, 2006, at 8:31 AM, Aahz wrote:
...
> Seriously, I can almost see why you think adaptation is a huge
> gain, but
> every time I start looking closer, I get bogged down in trying to
> understand adaptation registration. Do you have a simple way of
> explaining how that works *well* and *simply*? Because unless one can
> handle registration cleanly, I don't understand how adaptation can be
> generally useful for Python. Conversely, if adaptation
> registration is
> *NOT* required, please explain that in simple terms.
Yes, the ability to register adapters is indeed necessary to gain the
full benefits of "protocol adaptation". Let's see if I'm up to
explaining things a bit more clearly than I've managed in the past,
by "abstracting out" just adaptation and registration from the many
cognate issues (you can find more real-world approaches in Zope,
Twisted or PyProtocols, but perhaps I can better show the underlying
idea by sidestepping practical concerns such as performance
optimization and effective error-checking).
Suppose, just for simplicity and to separate this issue from issues
related to "how should interfaces or protocols be formalized", that
we choose, as the one and only way to identify each protocol, a
unique-string, such as the half-hearted 'com.python.stdlib.index'
example I mentioned earlier. It's probably an oversimplification --
it entirely leaves the issue of defining what syntactic, semantic,
and pragmatic assurances a protocol makes up to human-readable
documentation, and attempts no enforcement of such promises; the
reason I posit this hypothesis now is to sidestep the long,
interesting and difficult discussion of that issue, and focus on
adaptation and registration.
Second simplification: let's ignore inheritance. Again, I'm not
advocating this as a pratical approach: the natural choice, when
looking for an adaptation of type T to protocol P, is instead, no
doubt, to walk T's MRO, looking for the adaptation of each of T's
bases, and stop when the first one is met, and this has many
advantages (but also some issues, partly due to the fact that
sometimes inheritance is used more for implementation purposes than
for Liskovian purity, etc, etc). In the following, however, I'll
just assume that "adaptation is not inherited" -- just as a
simplification. Similarly, I will assume "no transitivity":
adaptation in the following is assume to go from a type T to a
protocol P (the latter identified by a unique string), in a single
step which either succeeds or fails, period. The advantages of
transitivity, like those of inheritance, are no doubt many, but they
can be discussed separately. I do not even consider the possibility
of having an inheritance structure among protocols, even though it
might be very handy; etc, etc.
So, each entity which we can call a "registration of adaptation" (ROA
for short) is a tuple ( (T, P), A) where:
T is a type;
P is a unique string identifying a protocol;
A is a callable, such that, for any direct instance t of T
(i.e., one such that type(t) is T), A(t) returns an object (often t
itself, or a wrapper over t) which claims "I satisfy all the
constraints which, together, make up protocol P". A(t) may also
raise some exception, in which case the adaptation attempt fails and
the exception propagates.
For example, A may often be the identity function:
def identity(x):
return x
where a ROA of ((T, P), identity) means "all direct instances of T
satisfy protocol P without need for wrappers or whatever".
The adaptation registry is a mapping from (T, P) to A, where each ROA
is an item -- (T, P) the key, A the value. The natural
implementation would of course be a simple dict.
Consider for example the hypothetical protocol P
'com.python.stdlib.index'. We could define it to mean: an object o
satisfies P iff int(o) is directly usable as index into a sequence,
with no loss of information.
So, a sequence type's __getitem__ would be...:
def __getitem__(self, index):
adapted_index = adapt(index, 'com.python.stdlib.index')
i = int(adapted_index)
# continue indexing using the int 'i', which is asserted to
be the correct index equivalent of argument 'index'
instead of today's (roughly):
def __getitem__(self, index):
i = index.__index__()
# etc, as above
Whoever (python-dev, in this case;-) invents the protocol P, besides
using it in methods such as __getitem__, might also pre-register
adapters (often, identity) for the existing types which are known to
satisfy protocol P. I.e., during startup, it would execute:
register_adapter(int, 'com.python.stdlib.index', identity)
register_adapter(long, 'com.python.stdlib.index', identity)
This is similar to what the inventor would do today (inserting an
__index__, at Python level, or tp_index, at C level, into the types)
but not "invasive" of the types (which doesn't matter here, since the
protocol's inventor also owns the types in question).
Once the protocol P is published, the author of a type T which does
respect/satisfy P would also execute register_adapter calls (instead
of modifying T by adding __index__/tp_index). Say that instances of
T do already satisfy the protocol, then identity is the correct adapter:
register_adapter(T, 'com.python.stdlib.index', identity)
But say that some T1 doesn't quite satisfy the protocol but can
easily be adapted to it. For example, instances t of T1 might not
support int(t), or support it in a way that's not correct for the
protocol, but (again for example) T1 might offer an existing as_index
method that would do what's required. Then, a non-identity adapter
is needed:
class wrapT1(object):
def __init__(self, t):
self.t = t
def __int__(self):
return self.t.as_index()
register_adapter(T1, 'com.python.stdlib.index', wrapT1)
The big deal here is that protocol P and type T1 may have been
developed independently and separately, and yet a third-party author
can still author and register such a wrapT1 adapter, and as long as
that startup code has executed, the application will be able to use
T1 instances as indexes into sequences "transparently". The author
of the application code need not even be aware of the details: he or
she may just choose to import a third-party module
"implement_T1_adaptations.py" just as he or she imports the framework
supplying T1 and the one(s) using protocol P. This separation of
concerns into up to 4 groups of developers (authors, respectively,
of: a framework defining/using protocol P; a framework supplying type
T1; a framework adapting T1 to P; an application using all of the
above) always seems overblown for "toy" examples that are as simple
as this one, of course -- and even in the real world in many cases a
developer will be wearing more than one of these four hats. But
adaptation *allows* the separation of "ownership" concerns by
affording *non-invasive* operation.
Here is a simple reference implementation of adaptation under all of
these simplifying assumptions:
_global_registry = {}
def register_adapter(T, P, A, registry=_global_registry):
registry[T, P] = A
def adapt(t, P, registry=_global_registry):
return registry[type(t), P]
Now, a million enrichments and optimizations can obviously be
imagined and discussed: the many simplifying assumptions I've been
making to try to isolate adaptation and registration down to its
simplest core leave _ample_ space for a lot of that;-). But I hope
that instead of immediately focusing on (premature?-) optimizations,
and addition of functionality of many kinds, we can focus on the
salient points I've been trying to make:
a. protocols may be identified quite arbitrarily (e.g. by unique-
strings), though specific formalizations are also perfectly possible
and no doubt offer many advantages (partial error-checking, less
indirectness, ...): there is no strict need to formalize interfaces
or protocols in order to add protocol-adaptation to Python
b. the possibility of supplying, consuming and adapting-to protocols
becomes non-invasive thanks to registration
In this simplified outline I've supported *only* registration as the
one and only way to obtain adaptation -- conceptually, through the
identity function, registration can indeed serve the purpose,
although optimizations are quite obviously possible.
I should probably also mention what I mean by "protocol": it's a very
general term, potentially encompassing interfaces (sets of methods
with given signatures -- a "syntactic" level), design-by-contract
kinds of constraints (a "semantic" level), and also the fuzzier but
important kinds of constraints that linguists call "pragmatic" (for
example, the concept of "integer usable as an index into a sequence
without significant loss of information" is definitely a pragmatic
constraint, not formalizable as semantics).
Alex
More information about the Python-3000
mailing list