[Python-3000] Generic functions
Phillip J. Eby
pje at telecommunity.com
Tue Apr 4 10:02:19 CEST 2006
At 11:03 PM 4/3/2006, Ian Bicking wrote:
>Guido van Rossum wrote:
>>On 4/3/06, Ian Bicking <ianb at colorstudy.com> wrote:
>>>As an alternative to adaptation, I'd like to propose generic functions.
>>> I think they play much the same role, except they are much simpler to
>>>use and think about.
>>Given that Phillip Eby is another proponent of generic functions I
>>seriously doubt the latter.
Hm. :)
Rather than branch all over the map, let me focus for a moment on a
very simple type of generic function - the kind that is essentially
equivalent to PEP 246-style adaptation. This will make it easier to
see the relationship.
In the RuleDispatch package, these simple generic functions are
defined using dispatch.on and f.when(), like this:
    import dispatch

    @dispatch.on('ob')  # name of the argument to dispatch based on
    def pprint(ob):
        """This is a pretty-print function"""

    @pprint.when(object)
    def pprint(ob):
        print repr(ob)

    @pprint.when(list)
    def pprint(ob):
        # code for the list case
Now, this is exactly equivalent to the much longer code that one
would write to define an IPrettyPrintable interface with a pprint()
method and adapter classes to define the implementation
methods. Yes, it's a convenient example - but it also corresponds to
a fairly wide array of problems, which also happen to account for a
significant share of the uses for adaptation.
As Ian mentioned, I wrote PyProtocols first, and discovered generic
functions afterward. In fact, 'dispatch.on()' and friends are
actually implemented internally using PyProtocols adapters. For
operations that do not require any state to be saved across method
invocations, and for interfaces with a small number of methods (e.g.
just one!), generic functions of this type are a more compact and
expressive approach.
The kicker is this: for this subset of generic functions, it doesn't
matter whether you make generic functions primitive, or you make
adaptation primitive, because either can be implemented in terms of
the other. For example, if you wanted to implement adaptation using
generic functions, you would just use a function for each interface
or protocol, and the "methods" of the function would be the adapter factories.
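As a rough sketch of that direction (adaptation built on a generic function), the following uses functools.singledispatch from the current Python 3 standard library as a stand-in for a RuleDispatch single-dispatch function. The interface name IPrettyPrintable is taken from the discussion above; ListPrinter and MyList are hypothetical names invented for illustration.

```python
from functools import singledispatch

# The "interface" is itself a generic function: calling it adapts an object.
@singledispatch
def IPrettyPrintable(ob):
    # Default "method": no adapter factory is registered for this type.
    raise TypeError("cannot adapt %r to IPrettyPrintable" % (ob,))

class ListPrinter:
    """Hypothetical adapter that gives lists a pprint() method."""
    def __init__(self, ob):
        self.ob = ob

    def pprint(self):
        return "[\n" + "".join("  %r,\n" % (item,) for item in self.ob) + "]"

# The adapter factory (here, the class itself) becomes a "method"
# of the generic function for the 'list' type.
IPrettyPrintable.register(list, ListPrinter)

class MyList(list):
    """Hypothetical subclass; it inherits the registration made for list."""

adapter = IPrettyPrintable(MyList([1, 2]))
print(adapter.pprint())
```

Calling IPrettyPrintable(ob) here plays the role that adapt(ob, IPrettyPrintable) would play in a PEP 246-style system.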
Anyway, this type of generic function is essentially similar to
Python's tables today for things like pretty printing, pickling, and
so on. It can be viewed as merely a syntactic convenience coupled
with standardization of how such an operation's registry will work,
both for registration and lookups.
In essence, Python's standard library already has generic functions;
they are just implemented differently for each operation that needs
such a registry.
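One such registry can be seen in a small example. This sketch is written for current Python 3, where the copy_reg module is spelled copyreg; Point and reduce_point are hypothetical names. It registers a reducer for a type and then lets copy.copy() look it up by type. pickle consults the same table.

```python
import copy
import copyreg

class Point:
    """Hypothetical type, standing in for one whose source you cannot edit."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

def reduce_point(p):
    # A reducer returns (callable, args) describing how to rebuild the object.
    return Point, (p.x, p.y)

# Registration: add an entry for Point to the shared per-type table.
copyreg.pickle(Point, reduce_point)

# Lookup: copy.copy() finds the reducer by type and uses it to rebuild.
original = Point(1, 2)
clone = copy.copy(original)
print(clone.x, clone.y)
```

The registration call and the by-type lookup are the two halves that a generic function would standardize.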
>>Watch it though. It may be a great example to explain generic
>>functions. But it may be the only example, and its existence may not
>>be enough of a use case to motivate the introduction of generic
>>functions.
It's sufficiently general to encompass just about any "visitor"
pattern. I wrote a short article on this a while back:
http://peak.telecommunity.com/DevCenter/VisitorRevisited
>>Whoa! First of all, my gut reaction is already the same as for
>>adaptation: having a single global registry somehow feels wrong. (Or
>>is it not global? "internal" certainly sounds like that's what you
>>meant; but for methods this seems wrong, one would expect a registry
>>per class, or something like that.)
This confusion is due to Ian mixing methods and functions in the
example. Generic functions are *functions*. If you put them in
classes, Python makes them methods. But there's nothing magical
about that - it's still a function. The VisitorRevisited article
explains this better, and in a way that doesn't delve into the
technicalities. Also, its examples are actually working ones based
on a fixed spec (i.e., the RuleDispatch implementation), so there's
no handwaving to get in the way.
>>Next, I wonder what the purpose of the PrettyPrinter class is. Is it
>>just there because the real pprint module defines a class by that
>>name? Or does it have some special significance?
>
>It's there because it is matching the pprint module. Also it holds
>some state which is useful to keep separate from the rest of the
>arguments, like the current level of indentation.
Note that this isn't really needed, nor necessarily ideal. If I were
really writing a pretty printer, I'd put indentation control in an
output stream argument, which would allow me to reuse an
IndentedStream class for other purposes. It would then suffice to
have a single pprint(ob,stream=IndentedStream(sys.stdout)) generic
function, and no need for a PrettyPrinter class.
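A minimal sketch of that design, assuming a hypothetical IndentedStream with a writeline() method and a level attribute (neither is specified in this thread), and again using functools.singledispatch from current Python 3 as a stand-in for the generic function:

```python
import io
import sys
from functools import singledispatch

class IndentedStream:
    """Hypothetical reusable stream that tracks the indentation level."""
    def __init__(self, out, width=2):
        self.out = out
        self.width = width
        self.level = 0

    def writeline(self, text):
        self.out.write(" " * (self.width * self.level) + text + "\n")

@singledispatch
def pprint(ob, stream=None):
    # Default case: any object is written using repr().
    if stream is None:
        stream = IndentedStream(sys.stdout)
    stream.writeline(repr(ob))

@pprint.register(list)
def pprint_list(ob, stream=None):
    # List case: the indentation state lives in the stream, not in a class.
    if stream is None:
        stream = IndentedStream(sys.stdout)
    stream.writeline("[")
    stream.level += 1
    for item in ob:
        pprint(item, stream)
    stream.level -= 1
    stream.writeline("]")

buf = io.StringIO()
pprint([1, [2, "a"]], IndentedStream(buf))
print(buf.getvalue(), end="")
```

No PrettyPrinter object is needed here, and the stream class could be reused by unrelated code.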
>>Are generic functions
>>really methods? Can they be either?
>
>They can be either.
They're really honest-to-goodness Python *functions* (with extra
stuff in the function __dict__). That means they behave like
ordinary functions when it comes to being methods. They can be
instance, class, or static methods, or not methods at all.
>>Ah, the infamous "when" syntax again, which has an infinite number of
>>alternative calling conventions, each of which is designed to address
>>some "but what if...?" objection that might be raised.
RuleDispatch actually has a fixed number of when() signatures, and
the one Ian gave isn't one of them. Single-dispatch functions'
when() takes a type or an interface, or a sequence of types or
interfaces. Predicate-dispatch functions take a predicate object, or
a string containing a Python expression that will be compiled to
create a predicate object.
None of RuleDispatch's when() decorators take keyword arguments at
the moment, at least not in the way of Ian's example.
>>What does when(object=list) mean? Does it do an isinstance() check?
>
>Yes; I think RuleDispatch has a form (though I can't remember what
>the form is -- it isn't .when()).
For simple single-dispatch, it's @some_function.when(type). So in
this case it'd be @pprint.when(list).
>>Is there any significance to the name pformat_list? Could I have
>>called it foobar? Why not just pformat?
>
>Just for tracebacks, and for example to make it greppable.
Actually, there's one other reason you might want a separate name,
and that's to reuse the code in an explicit upcall. You could then
explicitly invoke pformat_list() on something that wasn't an instance
of the 'list' type. For example, you could just call
"pprint.when(SomeListLikeType)(pformat_list)" to register
pformat_list as the method to be called for SomeListLikeType, as well
as for 'list' and its subclasses.
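To illustrate the reuse, here is a sketch using functools.singledispatch from current Python 3 in place of RuleDispatch; its register() plays the role of when(), although its naming rules differ in detail. SomeListLikeType is given a hypothetical body purely for illustration.

```python
from functools import singledispatch

@singledispatch
def pformat(ob):
    return repr(ob)

@pformat.register(list)
def pformat_list(ob):
    # Separately named, so it stays reachable for reuse and explicit upcalls.
    return "[" + ", ".join(pformat(item) for item in ob) + "]"

class SomeListLikeType:
    """Hypothetical iterable that does not inherit from list."""
    def __init__(self, *items):
        self.items = items

    def __iter__(self):
        return iter(self.items)

# Reuse the existing implementation for another type ...
pformat.register(SomeListLikeType)(pformat_list)
via_dispatch = pformat(SomeListLikeType(1, "x"))

# ... or call it explicitly, bypassing dispatch altogether.
explicit = pformat_list(("a", 1))
```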
If it were just called 'pformat', there'd be no name by which you
could access the original function. But you *are* allowed to call it
whatever you want. If it's called 'pformat' in this case, you just
lose access to it; the name 'pformat' remains bound to the overall
generic function, rather than to the specific implementation.
>>>* It requires cooperation from the original function (pformat -- I'm
>>>using "function" and "method" interchangeably).
>>Thereby not helping the poor reader who doesn't understand all of this
>>as well as you and Phillip apparently do.
Right. Don't interchange them, they have different meanings. A
generic function is just a function. It's not a method unless you make it one.
Generic functions can *have* methods, however. Each *implementation*
for a generic function, like each reducer in the pickling registry,
is called a "method" *of* the generic function. This terminology is
lifted straight from Lisp, but I'm not attached to it. If anybody
has a better terminology, in fact, I'm all for it!
In CLOS, by the way, there are no object methods, only generic
function methods. Rather than add methods to classes, you add them
to the generic functions. As with adaptation, this is 100%
equivalent in computational terms.
What's *not* equivalent is the user interface. If you are writing a
closed-ended program in any language, it really doesn't matter
whether the objects have methods or the functions do. You're just
writing a program, so how you organize it is a matter of taste.
However, if you are writing an extensible library, then generic
functions (especially multiple-dispatch ones) have a *tremendous*
advantage, because you aren't forced to pick only one way to
expand. And if you don't have generic functions in your language,
you'll just reinvent them in your library -- like with pickle and
copy_reg in the Python stdlib.
The advantage that they offer is twofold: 1) you can add new
operations that cut across all types, existing or future. 2) you can
add new types that can work with any operation, existing or future.
Side note: Ruby dodges this problem by making classes
open-ended. Since you can add a new __special__ method to any
existing class (even one you didn't write, including built-in types),
you don't need to have operation-specific type lookup registries (a la
pickle/copy_reg), and for that matter you don't need adaptation! The
only problem you might run into is namespace clashes, and I'm not
sure how Ruby addresses this.
Anyway, there are lots of ways to skin these cats; the main
difference is in how the user sees things and expresses their
ideas. If you are creating a closed system used by a single
developer, you don't need any of this. But if you need an extensible
library, you need a mechanism for extension that allows third parties
to implement operation A for type B, even if they did not create the
libraries containing A and B. Adaptation, generic functions, and
Ruby-style open classes are all computationally-equivalent mechanisms
for doing this, in that you could translate a given library to use
any of the three techniques.
>>I'm guessing that you are contrasting it with len(), which could be
>>seen as a special kind of built-in "generic function" if one squints
>>enough, but one that requires the argument to provide the __len__
>>magic method. But since len() *does* require the magic method, doesn't
>>that disqualify it from competing?
You could view this as being a generic function with only one method:
    @len.when(object)
    def len(ob):
        return ob.__len__()
And which of course is not extensible. :)
>Yes, this is in contrast with len(), which achieves its goal because
>the people who write the len() function write the entire language,
>and can put a __len__ on whatever they want ;) For other cases
>magic-method based systems tend to look like:
>
>def pprint(object):
>    if isinstance(object, list): ...
>    elif isinstance(object, tuple): ...
>    ...
>    elif hasattr(object, '__pprint__'):
>        object.__pprint__()
>    else:
>        print repr(object)
>
>That is, all the built in objects get special-cased and other
>objects define a magic method.
Yeah, and then there are all the libraries like pydoc that have these
huge if-else trees and aren't extensible because they don't even have
a magic method escape or any kind of registry you can extend. A
uniform way to do this kind of dispatching means uniform
extensibility of libraries.
>>>* The function is mostly self-describing.
>>Perhaps once you've wrapped your head around the when() syntax. To me
>>it's all magic; I feel like I'm back in the situation again where I'm
>>learning a new language and I haven't quite figured out which
>>characters are operators, which are separators, which are part of
>>identifiers, and which have some other magical meaning. IOW it's not
>>describing anything for me, nor (I presume) for most Python users at
>>this point.
I'd be interested to know if you still feel that way after reading
VisitorRevisited, since it doesn't do any hand-waving around what when() does.
>>Tell us more about the registration machinery. Revealing that (perhaps
>>simplified) could do a lot towards removing the magical feel.
My simple generic function implementation is actually implemented
using adaptation - I create a dummy interface for the generic
function, and then define adapters that return the implementation
functions. IOW, adapt(argument,dummy_interface) returns the function
to actually call. The actual generic function object is a wrapper
that internally does something like:
    def wrapper(*args,**kw):
        return adapt(some_arg, dummy_interface)(*args, **kw)
However, if I were implementing a generic function like this "from
scratch" today, I'd probably just make the function __dict__ contain
a registry dictionary, and the body of the function would be more like:
    def wrapper(*args,**kw):
        for cls in type(some_arg).__mro__:
            if cls in wrapper.registry:
                return wrapper.registry[cls](*args, **kw)
Voila. Simple generic functions. Of course, the actual
implementation today is complex because of support for classic
classes and even ExtensionClasses, and having C code to speed it up, etc.
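Filled out into something runnable, the registry-in-__dict__ idea might look like the sketch below. It is written for current Python 3 and is not RuleDispatch's actual code: the generic decorator is a hypothetical name, and for brevity it dispatches on the first positional argument rather than on a named one.

```python
import functools

def generic(func):
    """Hypothetical decorator: dispatch on the type of the first argument."""
    @functools.wraps(func)
    def wrapper(*args, **kw):
        # Walk the MRO so the most specific registered class wins.
        for cls in type(args[0]).__mro__:
            if cls in wrapper.registry:
                return wrapper.registry[cls](*args, **kw)
        # Nothing registered for this type: fall back to the original body.
        return func(*args, **kw)

    def when(cls):
        # Returns a decorator that registers a method for cls.
        def register(method):
            wrapper.registry[cls] = method
            return method
        return register

    # The registry lives in the function's own __dict__.
    wrapper.registry = {}
    wrapper.when = when
    return wrapper

@generic
def pprint(ob):
    return repr(ob)

@pprint.when(list)
def pprint_list(ob):
    return "[" + ", ".join(pprint(item) for item in ob) + "]"

print(pprint([1, "a", [2.5]]))
```

Because lookup walks the MRO, subclasses of list pick up pprint_list without any further registration.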
>>>* Magic methods do *not* have this import problem, because once you have
>>>an object you have all its methods, including magic methods.
>>Well, of course that works only until you need a new magic method.
>
>Yes, pluses and minuses ;)
Ruby of course solves this by letting you add the magic methods to
existing classes, and Python allows this for user-defined types,
although we look down on it as "monkey patching".
>>>Type-based generic functions and adaptation are more-or-less equivalent.
>>> That is, you can express one in terms of the other, at least
>>>functionally if not syntactically.
>>Could you elaborate this with a concrete example?
Here's pprint() written as an interface and adaptation:
    class IPPrintable(Interface):
        def pprint():
            """pretty print the object"""

    def pprint(ob):
        adapt(ob, IPPrintable).pprint()
In reality, this would be much more code, because I've left out the
adapter classes to adapt each type and give it a pprint() method.
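To give a feel for the omitted part, here is a sketch with the adapter classes filled in. It is written for current Python 3. The adapt() function and _adapters table are hypothetical minimal stand-ins, not PyProtocols' real machinery, and ObjectPrinter and ListPrinter are invented names.

```python
_adapters = {}  # hypothetical registry: (interface, type) -> adapter factory

def adapt(ob, interface):
    """Minimal stand-in for a PEP 246-style adapt()."""
    for cls in type(ob).__mro__:
        factory = _adapters.get((interface, cls))
        if factory is not None:
            return factory(ob)
    raise TypeError("cannot adapt %r" % (ob,))

class IPPrintable:
    """Marker class standing in for an Interface subclass."""

class ObjectPrinter:
    """Adapter for the general case."""
    def __init__(self, ob):
        self.ob = ob

    def pprint(self):
        print(repr(self.ob))

class ListPrinter(ObjectPrinter):
    """Adapter for lists: one item per line."""
    def pprint(self):
        print("[")
        for item in self.ob:
            print("  %r," % (item,))
        print("]")

# One registration per (interface, type) pair.
_adapters[IPPrintable, object] = ObjectPrinter
_adapters[IPPrintable, list] = ListPrinter

def pprint(ob):
    adapt(ob, IPPrintable).pprint()

pprint([1, 2])
```

Each supported type costs a whole adapter class here, where the generic-function version needs only one decorated function per type.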
>>>Anyway, I think generic functions are very compatible with Python syntax
>>>and style, and Python's greater emphasis on what an object or function
>>>can *do*, as opposed to what an object *is*, as well as the use of
>>>functions instead of methods for many operations. People sometimes see
>>>the use of functions instead of methods in Python as a weakness; I think
>>>generic functions turns that into a real strength.
>>Perhaps. The import-for-side-effect requirement sounds like a
>>showstopper though.
In practice, it doesn't happen that often, because you're either
adding an operation to some type of yours, or adding a type to some
operation of yours. And in the case where you're adding operation A
to type B and you own neither of the two -- you're still not
importing for side-effect except in the sense that the code that
defines the A(B) operation is part of your application anyway. The
case where there is some *third-party* module that implements A(B) is
quite rare in my experience.
However, you could have such a scenario *now*, if "operation A" is
"pickle", for example, and B is some type that isn't ordinarily
picklable. Such a situation requires importing for side effects now,
if you're not the one who wrote the pickling operation for it.
In fact, it doesn't matter *what* approach you use to provide
extensibility; if it allows user C to define operation A on type B,
it requires user D to import C's code for side-effects if they want
to use it. This equally applies to adaptation and to Ruby's open
classes, as it does to generic functions. It's simply the logical
consequence of having third-party extensibility!