[Python-3000] pep 3124 plans
Phillip J. Eby
pje at telecommunity.com
Mon Jul 30 21:45:33 CEST 2007
At 02:20 PM 7/30/2007 -0400, Jim Jewett wrote:
>On 7/21/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>
> >... If you have to use @somegeneric.before and
> > @somegeneric.after, you can't decide on your own to add
> > @somegeneric.debug.
>
> > However, if it's @before(somegeneric...), then you can add
> > @debug and @authorize and @discount and whatever else
> > you need for your
> > application, without needing to monkeypatch them in.
>
>I honestly don't see any difference here. @somegeneric.method implies
>that somegeneric is an existing object, and even that it already has
>rules for combining .before and .after; it can just as easily have a
>rule for combining arbitrary methods.
I don't understand what you're saying or how it relates to what I said above.
If you define a new kind of method qualifier (e.g. @discount), then
all existing generic functions aren't suddenly going to grow a
'.discount' attribute. That's what the above discussion is about --
how you *access* qualifier decorators.
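To make the contrast concrete, here's a deliberately dumbed-down sketch (not
the PEP 3124 implementation; the flat 'registry' list and all the names here
are just illustration). The point is that a standalone qualifier is simply
another function anyone can write, without the generic function's type having
to grow a new attribute:

# Illustrative toy only -- a flat list standing in for the real engine.
registry = []

def before(gf, signature):
    """A standalone 'before' qualifier: records a rule for gf."""
    def register(body):
        registry.append(('before', gf, signature, body))
        return body
    return register

def debug(gf, signature):
    """A third-party qualifier, defined exactly the same way -- no
    monkeypatching of the generic function (or its type) required."""
    def register(body):
        registry.append(('debug', gf, signature, body))
        return body
    return register

def render(ob):
    """Some hypothetical generic function."""

@debug(render, (int,))
def log_int_render(ob):
    print("rendering an int:", ob)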
>If you're saying that @discount could include its own combination
>rules, then each method needs to repeat the boilerplate to pick apart
>the current decision tree.
Still don't understand you. Method combination is done with a
generic function called "combine_actions" which takes two arbitrary
"method" objects and returns a new "method" representing their
combination. There is no boilerplate or picking anything apart.
> The only compensating "advantage" I see is
>that the decision tree could be changed arbitrarily from anywhere,
>even as "good practice." (Since my new @thumpit decorator would takes
>the generic as an argument, you won't see the name of the generic in
>my file; you might never see it there was iteration involved.)
Decision trees are generated from a flat collection of rules; they're
not directly manipulated. In the default implementation (based on
Guido's prototype), the "tree" is just a big dictionary mapping
tuples of types to "method" objects created by combining all the
methods whose signatures are implied by that tuple of types. It's
also sparse, in that it doesn't contain type combinations that
haven't been looked up yet. So there isn't really any tree that you
could "change" here.
There's just a collection of rules, where a rule consists of a
predicate, a definition order, a "body" (function), and a method
factory. A predicate is a collection of possible signatures (e.g.
the sequence of applicable types) -- i.e., an OR of ANDs.
To actually build a tree, rules are turned into a set of "cases",
where each case consists of one signature from the rule's predicate,
plus a method instance created using the signature, body, and
definition order. (Not all methods care about definition order, just
ones like before/after.)
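As a rough sketch of those data shapes (the names are illustrative, not taken
from the implementation):

from collections import namedtuple

# A rule: a predicate (an OR of signatures), a definition order, a body
# function, and a factory that builds method objects of the right type.
Rule = namedtuple('Rule', 'predicate order body factory')

def rule_to_cases(rule):
    # One case per signature in the predicate: the signature, plus a
    # method instance built from the signature, body, and definition order.
    return [
        (signature, rule.factory(signature, rule.body, rule.order))
        for signature in rule.predicate
    ]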
In the default engine (loosely based on Guido's prototype), these
cases are merged by using combine_actions() on any cases with the
same signature, and stored in a dictionary called the
"registry". The registry is built up incrementally as you add methods.
When you call the function, a type tuple is built and looked up in
the cache. If nothing is found in the cache, we loop over the
*entire* registry, and build up a derived method, like this (actual
code excerpt):
try:
    f = cache[types]
except KeyError:
    # guard against re-entrancy looking for the same thing...
    action = cache[types] = self.rules.default_action
    for sig in self.registry:
        if sig==types or implies(types, sig):
            action = combine_actions(action, self.registry[sig])
    f = cache[types] = action
return f(*args)
The 'self.rules.default_action' is to method objects what zero is to
numbers -- the start of the summing. Ordinarily, the default action
is a NoMethodFound object -- a perfectly valid "method"
implementation whose behavior is to raise an error. All other method
types have higher combination precedence than NoMethodFound, so it
always sinks to the end of any combination of methods.
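In sketch form (an assumed shape, not the actual class), that "zero" is
something like:

class NoApplicableMethods(Exception):
    """Raised when no registered method applies to the arguments."""

class NoMethodFound:
    # A perfectly valid "method": calling it raises.  Because every other
    # method type overrides it, it always ends up at the tail of any
    # combination, so it only runs when nothing else applied.
    def __call__(self, *args, **kw):
        raise NoApplicableMethods(args, kw)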
The relevant generic functions here are implies(), combine_actions(),
and overrides() -- where combine_actions() calls overrides() to find
out which action should override the other, and then returns
overriding_action.override(overridden_action).
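Roughly, then, combine_actions() looks something like this (a hedged
reconstruction, not the actual code; it assumes the overrides() generic
function and the .override()/.merge() protocol described here):

def combine_actions(a1, a2):
    # overrides() decides which action takes precedence...
    if overrides(a1, a2):
        return a1.override(a2)      # ...and the winner wraps the loser
    elif overrides(a2, a1):
        return a2.override(a1)
    else:
        return a1.merge(a2)         # same precedence: merge instead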
The overrides() relationship of two actions of the same type (e.g.
two Around methods) is defined by the implies() relationship of the
action signatures. For Before/After methods, the definition order is
used to resolve any ambiguity in the implies().
The .override() of a method is usually a new instance of the same
method type, but with a "tail" that points to the overridden method,
so that next_method will do the right thing.
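In sketch form (illustrative only; the real method types carry more state),
that usual .override() shape is:

class Method:
    def __init__(self, body, tail=None):
        self.body = body        # the user-supplied function
        self.tail = tail        # the overridden method, if any

    def override(self, other):
        # A new instance of the same type, chained onto the overridden
        # method so that next_method resolves to it.
        return self.__class__(self.body, tail=other)

    def __call__(self, *args, **kw):
        # The body receives the tail as its next_method argument.
        return self.body(self.tail, *args, **kw)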
There are more details than this, of course, but the point is that
method combination is 100% orthogonal to the dispatch tree
mechanism. You can build any kind of dispatch engine you want, just
by using combine_actions to combine the actions. The action types
themselves only need to know how to .override() a lower precedence
method and .merge() with a same-precedence method. And there needs
to be an overrides() relationship defined between all pairs of method
types, but in my current version of the implementation, overrides()
is automatically transitive for any type-level relationship.
So if you define a type that overrides Around, then it also overrides
anything that Around overrides. So, for the most part you just say
what types you want to override (and/or be overridden by), and maybe
add a rule for how to compare two methods of your type (if the
default of comparing by the implies() of signatures isn't sufficient).
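As a purely illustrative toy (not the actual registration API), the
transitive part works out like this:

class Method: pass
class Before(Method): pass
class Around(Method): pass
class Discount(Method):
    """A hypothetical application-defined method type."""

# A toy type-level "overrides" table; the real relationship is a generic
# function, but the transitive walk is the same idea.
_overrides = {Around: {Before}}

def declares_override(winner, loser):
    _overrides.setdefault(winner, set()).add(loser)

def type_overrides(winner, loser):
    # Walk the declared relation transitively.
    seen, stack = set(), [winner]
    while stack:
        t = stack.pop()
        for lower in _overrides.get(t, ()):
            if lower is loser:
                return True
            if lower not in seen:
                seen.add(lower)
                stack.append(lower)
    return False

declares_override(Discount, Around)         # say only this much...
assert type_overrides(Discount, Before)     # ...and Before is overridden too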
The way that generic functions make this incredible orthogonality and
flexibility possible is itself an argument for generic functions,
IMO. Certainly, it's a hell of an argument for implementing generic
functions in terms of other generic functions, which is why I did
it. It beats the crap out of my previous implementation approaches,
which had way too much coupling between method combination and
tree-building and rules and cases and whatnot.
Separating these ideas into different functional/conceptual domains
makes the whole thing easier to understand -- as long as you're not
locked into procedural-implementation thinking. If you want to think
step-by-step, it's potentially a vast increase in complication. On
the other hand, it's like thinking about reference counting while
writing Python code. Sure, you need to drop down to that level every
now and then, but it's a waste of time to think about it 90% of the
time. Being able to have a class of things that you *don't* think
about is what makes Python a higher-level language than the C it's
implemented with.
In the same way, generic functions are a higher-level version of OO
-- you get to think in terms of a domain's abstract operations, like
implication, overriding, and combination in this example.
The domain abstractions are not an "interface", nor are they methods
or object types. They're more like "concepts", except that the term
"concept" has been abused to refer to much lower-level things that
can attach to only one object within an operation.
The concept of implication is that there are imply-ers and imply-ees
-- a role for each argument, each of which is an implicit interface
or abstract object type.
In traditional OO, and even with interfaces, there are considerable
limits on your ability to specify such partial interfaces and the
relationships between them, forcing you to choose an arbitrary,
implementation-defined organization to put them in. You then have to
force-fit objects to have the right methods, because you didn't
define an x.is_implied_by(y) relationship, only an x.implies(y) relationship.
Thing is, a *relationship* doesn't belong to one side or the other --
it's a *relationship*. A third, independent thing. Like a GF method.
In any program, these relationships already exist, and you still have
to understand them. They're just forced into whatever pattern the
designer chose, or had thrust upon them, to make them fit the
at-best-binary nature of OO methods, instead of being called out as
explicit relationships that follow the form of the problem domain.
>I realize that subclasses are theoretically just as arbitrary, but
>they aren't in practice.
Right -- and neither are generic functions in normal usage. The only
reason you think that subclasses aren't arbitrary is because you're
used to the ways that things get force-fitted into those
relationships. Whereas, with GF's, the program can simply model the
application domain relationships, and you're going to know what
patterns will follow because they'll reflect the application domain.
For example, if you see implies() and combine_actions() and
overrides(), are you going to have any problems knowing, when you see
a type, whether these GF's might have methods for that type? You'll
know when to *look* for such a method, because you know what roles
the arguments play in each GF. If the type might play such a role,
then you'll want to know *how* it plays that role in connection with
specific collaborators or circumstances -- and you'll know what
method implementations to look for.
It's ridiculously simple in practice, even though it sounds hard in
theory. That's the very problem in fact -- in neither subclassing
nor GF's can you solve such problems *in theory*. You can only solve
them in *practice*, because it's only in the context of a specific
program that you have any domain knowledge to apply -- i.e.,
knowledge about what general kinds of things the program is supposed
to do and what general kinds of things it does them with.
If you have that general knowledge, it's just as easy to handle one
organization as the other -- but the GF-based version gives you the
option of having a module that defines lots of basic "kinds of things
it's supposed to do" up front, so that you have an idea of how to
understand the "things it does them with" when you encounter them.
>You can certainly say now that configuration specialization should be
>in one place, and that dispatching on parameter patterns like
>
>(* # ignored
>, :int # actual int subclass
>, :Container # meets the Container ABC
>, 4<val<17.3 # value-specific rule
>)
>
>is a bad idea
But I *don't* say that. What I say is that in practice, there are
only a few natural places to *put* such a definition:
* near the definition of Container (or int, but that's a builtin in this case)
* near the definition of the generic function being overloaded
* in a "concern-based" grouping, e.g. an appropriate module that
groups together matters for some application-domain concept. (For
example, an "ordering_policy" module might contain overrides for a
variety of generic functions that relate to inventory, shipping, and
billing, within the context of placing orders.)
* in an application-designated catchall location
Which of these locations is "best" depends on the overall size of the
program. A one-module program is certainly small enough to not need
to pick one. As a system gets bigger, some of the other usage
patterns become more applicable.
>-- but whenever I look at an application from the
>outside, well-organized configuration data is a rare exception.
That may be -- but one enormous advantage of generic functions is
that you can always relocate your method definitions to a different
module or different part of the same module without affecting the
meaning of the program, as long as all the destination modules are
imported by the time you execute any of the functions.
In other words, if a program is messy, you can clean it up -- heck,
it's potentially safer to do with an automatic refactoring tool than
other types of refactorings in Python (e.g., changing the signature
of a 'foo()' method is difficult to do safely because you don't
necessarily know whether two arbitrary methods *named* 'foo' are
semantically the same, whereas generic functions are objects, not names.)