[Python-3000] Adaptation [was:Re: Iterators for dict keys, values, and items == annoying :)]

Brett Cannon brett at python.org
Sun Apr 2 00:40:00 CEST 2006


Woohoo!  I get it, finally!  Some comments below, but I suddenly feel
a little less stupid since I get the whole process now!  =)

On 4/1/06, Alex Martelli <aleaxit at gmail.com> wrote:
>
> On Apr 1, 2006, at 8:31 AM, Aahz wrote:
>     ...
> > Seriously, I can almost see why you think adaptation is a huge
> > gain, but
> > every time I start looking closer, I get bogged down in trying to
> > understand adaptation registration.  Do you have a simple way of
> > explaining how that works *well* and *simply*?  Because unless one can
> > handle registration cleanly, I don't understand how adaptation can be
> > generally useful for Python.  Conversely, if adaptation
> > registration is
> > *NOT* required, please explain that in simple terms.
>
> Yes, the ability to register adapters is indeed necessary to gain the
> full benefits of "protocol adaptation". Let's see if I'm up to
> explaining things a bit more clearly than I've managed in the past,
> by "abstracting out" just adaptation and registration from the many
> cognate issues (you can find more real-world approaches in Zope,
> Twisted or PyProtocols, but perhaps I can better show the underlying
> idea by sidestepping practical concerns such as performance
> optimization and effective error-checking).
>
> Suppose, just for simplicity and to separate this issue from issues
> related to "how should interfaces or protocols be formalized", that
> we choose, as the one and only way to identify each protocol, a
> unique-string, such as the half-hearted 'com.python.stdlib.index'
> example I mentioned earlier.  It's probably an oversimplification --
> it entirely leaves the issue of defining what syntactic, semantic,
> and pragmatic assurances a protocol makes up to human-readable
> documentation, and attempts no enforcement of such promises; the
> reason I posit this hypothesis now is to sidestep the long,
> interesting and difficult discussion of that issue, and focus on
> adaptation and registration.
>
> Second simplification: let's ignore inheritance. Again, I'm not
> advocating this as a pratical approach: the natural choice, when
> looking for an adaptation of type T to protocol P, is instead, no
> doubt, to walk T's MRO, looking for the adaptation of each of T's
> bases, and stop when the first one is met, and this has many
> advantages (but also some issues, partly due to the fact that
> sometimes inheritance is used more for implementation purposes than
> for Liskovian purity, etc, etc).  In the following, however, I'll
> just assume that "adaptation is not inherited" -- just as a
> simplification.  Similarly, I will assume "no transitivity":
> adaptation in the following is assume to go from a type T to a
> protocol P (the latter identified by a unique string), in a single
> step which either succeeds or fails, period. The advantages of
> transitivity, like those of inheritance, are no doubt many, but they
> can be discussed separately. I do not even consider the possibility
> of having an inheritance structure among protocols, even though it
> might be very handy; etc, etc.
>
> So, each entity which we can call a "registration of adaptation" (ROA
> for short) is a tuple ( (T, P), A) where:
>      T is a type;
>      P is a unique string identifying a protocol;
>      A is a callable, such that, for any direct instance t of T
> (i.e., one such that type(t) is T), A(t) returns an object (often t
> itself, or a wrapper over t) which claims "I satisfy all the
> constraints which, together, make up protocol P".  A(t) may also
> raise some exception, in which case the adaptation attempt fails and
> the exception propagates.
>
> For example, A may often be the identity function:
>
> def identity(x):
>      return x
>
> where a ROA of ((T, P), identity) means "all direct instances of T
> satisfy protocol P without need for wrappers or whatever".
>
> The adaptation registry is a mapping from (T, P) to A, where each ROA
> is an item -- (T, P) the key, A the value.  The natural
> implementation would of course be a simple dict.
>
>
> Consider for example the hypothetical protocol P
> 'com.python.stdlib.index'.  We could define it to mean: an object o
> satisfies P iff int(o) is directly usable as index into a sequence,
> with no loss of information.
>
> So, a sequence type's __getitem__ would be...:
>      def __getitem__(self, index):
>          adapted_index = adapt(index, 'com.python.stdlib.index')
>          i = int(adapted_index)
>          # continue indexing using the int 'i', which is asserted to
> be the correct index equivalent of argument 'index'
>
> instead of today's (roughly):
>
>      def __getitem__(self, index):
>          i = index.__index__()
>          # etc, as above
>
> Whoever (python-dev, in this case;-) invents the protocol P, besides
> using it in methods such as __getitem__, might also pre-register
> adapters (often, identity) for the existing types which are known to
> satisfy protocol P.  I.e., during startup, it would execute:
>
> register_adapter(int, 'com.python.stdlib.index', identity)
> register_adapter(long, 'com.python.stdlib.index', identity)
>
> This is similar to what the inventor would do today (inserting an
> __index__, at Python level, or tp_index, at C level, into the types)
> but not "invasive" of the types (which doesn't matter here, since the
> protocol's inventor also owns the types in question).
>

I am going to assume an optimization is possible where if an object
meets a protocol it doesn't need it's contract explicitly stated.  Is
that reasonable?  The reason I ask is I could see an explosion of
registration calls for objects trying to cover every protocol they
match and then new protocols that have been defined doing for the
objects, etc., and ending up still with some protocols missed since it
seems to require some knowledge of what types will work.

Take our __index__ example.  I might want to use an object that I
think can be used for indexing a sequence but I don't know about any
specific protocols required and neither the __index__ protocol creator
nor the object designer knew of each other and thus didn't bother with
registering.  Is it still going to work, or am I going to get an
exception saying that the object didn't register for some protocol I
wasn't aware of?

Also, if defaults are not implied, then a good way to handle
registration of classes will need to be developed.  This might be
another place where class decorators come in handy over metaclasses
since if inheritance comes into play then registering every subclass
would be overkill.

>
> Once the protocol P is published, the author of a type T which does
> respect/satisfy P would also execute register_adapter calls (instead
> of modifying T by adding __index__/tp_index).  Say that instances of
> T do already satisfy the protocol, then identity is the correct adapter:
>
> register_adapter(T, 'com.python.stdlib.index', identity)
>
> But say that some T1 doesn't quite satisfy the protocol but can
> easily be adapted to it.  For example, instances t of T1 might not
> support int(t), or support it in a way that's not correct for the
> protocol, but (again for example) T1 might offer an existing as_index
> method that would do what's required.  Then, a non-identity adapter
> is needed:
>
> class wrapT1(object):
>      def __init__(self, t):
>          self.t = t
>      def __int__(self):
>          return self.t.as_index()
>
> register_adapter(T1, 'com.python.stdlib.index', wrapT1)
>
>
> The big deal here is that protocol P and type T1 may have been
> developed independently and separately, and yet a third-party author
> can still author and register such a wrapT1 adapter, and as long as
> that startup code has executed, the application will be able to use
> T1 instances as indexes into sequences "transparently".  The author
> of the application code need not even be aware of the details: he or
> she may just choose to import a third-party module
> "implement_T1_adaptations.py" just as he or she imports the framework
> supplying T1 and the one(s) using protocol P.  This separation of
> concerns into up to 4 groups of developers (authors, respectively,
> of: a framework defining/using protocol P; a framework supplying type
> T1; a framework adapting T1 to P; an application using all of the
> above) always seems overblown for "toy" examples that are as simple
> as this one, of course -- and even in the real world in many cases a
> developer will be wearing more than one of these four hats.  But
> adaptation *allows* the separation of "ownership" concerns by
> affording *non-invasive* operation.
>
>
> Here is a simple reference implementation of adaptation under all of
> these simplifying assumptions:
>
> _global_registry = {}
>
> def register_adapter(T, P, A, registry=_global_registry):
>      registry[T, P] = A
>
> def adapt(t, P, registry=_global_registry):
>      return registry[type(t), P]
>
>
> Now, a million enrichments and optimizations can obviously be
> imagined and discussed: the many simplifying assumptions I've been
> making to try to isolate adaptation and registration down to its
> simplest core leave _ample_ space for a lot of that;-).  But I hope
> that instead of immediately focusing on (premature?-) optimizations,
> and addition of functionality of many kinds, we can focus on the
> salient points I've been trying to make:
>
> a. protocols may be identified quite arbitrarily (e.g. by unique-
> strings), though specific formalizations are also perfectly possible
> and no doubt offer many advantages (partial error-checking, less
> indirectness, ...): there is no strict need to formalize interfaces
> or protocols in order to add protocol-adaptation to Python
>
> b. the possibility of supplying, consuming and adapting-to protocols
> becomes non-invasive thanks to registration
>
> In this simplified outline I've supported *only* registration as the
> one and only way to obtain adaptation -- conceptually, through the
> identity function, registration can indeed serve the purpose,
> although optimizations are quite obviously possible.
>
> I should probably also mention what I mean by "protocol": it's a very
> general term, potentially encompassing interfaces (sets of methods
> with given signatures -- a "syntactic" level), design-by-contract
> kinds of constraints (a "semantic" level), and also the fuzzier but
> important kinds of constraints that linguists call "pragmatic" (for
> example, the concept of "integer usable as an index into a sequence
> without significant loss of information" is definitely a pragmatic
> constraint, not formalizable as semantics).
>

If we can make the default case for when an object implements a
protocol dead-simple (if not automatic) in terms of registering or
doing the right thing, then I can see this being really helpful.

-Brett


More information about the Python-3000 mailing list