Handling COM Events in Python (was Python and Windows Scripting Host)

Sat Aug 12 07:07:36 EDT 2000

"Mark Hammond" <MarkH at ActiveState.com> wrote in message
news:GV6l5.2068$gP5.20280 at news-server.bigpond.net.au...

    [snip]

I apparently caused serious confusion by mixing different concerns
in the discussion without clearly labeling each.  So let me try
to clarify.

a. part of what I was discussing was about "impedance matching"
    between COM stuff (reference-counted), and stuff that's not
    reference-counted, but otherwise GC'd.  That was because that
    matching had been a concern in the COM-enabling effort I had
    undertaken before (not for Python, but for another, proprietary
    scripting language).

b. another part of what I was discussing was whether to hook a
    whole class to a source-interface (the normal COM way), or
    a specific bound-method to a specific event (aka 'delegate',
    Hejlsberg's darling -- found in Delphi, VJ++6, MS .NET).

I think both issues are going to be important in .NET <-> COM
connections, btw.  But in the context of the current thread,
(a) is rather a side-show; and (b) is a design choice that
basically makes a difference in convenience (to the scripting
programmer, and to the impedance-matcher-implementer), but is
not all that deep -- it's a SMOP to emulate whole-interfaces
via bound-methods, or vice versa, modulo some performance (not
major either way, it seems to me).

> > As Python, luckily, uses a reference-count approach that strongly
> > parallels COM's, the Python objects that 'shadow' (implement, or
> > proxy/facade) the COM objects could also emulate this kind of
> > behavior,
>
> Hrm.  I hope you aren't calling for the user to do an "unadvise" or
> equivalent.  I should have stated that it was a design goal _not_ to
> do that.  My first versions did indeed create a reference count, and
> an explicit release mechanism was obviously the first idea.  However,
> IMO, it sucks.
>
> I didn't want the user to remember to close their objects, otherwise
> leaking invisible COM objects (often implemented in their process) and
> all the nastiness that goes along with this.

That does make things more difficult, yes.  I can't entirely follow
the .NET mailing list from developmentor (too much volume), but I've
already seen many flames there on this kind of point (need for explicit
close vs. reference-counting's handy guarantee-of-finalization... but
only absent cycles).

But the case we have at hand is not all that general, and thus
admits of solutions which might not be generalizable.  To wit:

When methods of object A must refer to object B, and vice versa,
having a pointer to B as a part of A's state and vice versa does
create a loop.  The obvious alternative is to pass either of the
pointers among the arguments; say, for definiteness, if B is
always in some sense the initiator of the interaction, then A's
methods (that B calls, or arranges to be called) have a reference
to B among their parameters (overtly, or as a field of a parameter
that can also contain other utility stuff).

Besides avoiding the loop (which only exists transiently during
a call to A's methods -- a time at which you don't want anything
to be GC'd anyway), this also sometimes makes A more flexibly
reusable -- different objects B, B', B", ..., may make use of
A's methods, just by passing themselves (or arranging for
themselves to be passed) into them.

The 'verbosity' you so deplore would appear to be the price
to pay.  However, isn't it a case of explicit vs implicit?  The
Python mantra is "Explicit is better than Implicit".  The 'self'
argument is explicitly received in method-calls, for example,
while other languages leave it implicit.  There are substantial
semantic advantages to this explicitness which some would regard
as a lack of handy syntax-sugar.  It seems to me that the just-
as-explicit passing of "the other [non-self] object", also as a
parameter to the method, is very much in the same vein: a tad
more explicit/less implicit, a tad less syntactically sugary,
with substantial semantic advantages to compensate.  Doesn't that
make it a very Pythonic solution?

> If I read your mail correctly, the crux of it is that you suggest code
> like:
> --
> class HandlerClass:
>     def OnClick(self,source,pEvtObj):
>         return 1
>     # Other event handlers as appropriate
>
> # Make the event handler class - once for all events.
> aHandler=HandlerClass()
>
> # Connect our events - one line per event
> aHandler.clickCookie = mgrObj.Advise(source,'onclick',aHandler.OnClick)
>
> # Do our stuff.
>
> # Unhook - once per event
> mgrObj.UnAdvise(aHandler.clickCookie)

Please note, however, that there is no loop of references here.
"This" object has references to mgrObj, source, and aHandler; mgrObj
has references to source and aHandler (via the bound aHandler.OnClick
method); source has references to aHandler (via bound methods).

It all goes in just one direction.  The Advise/UnAdvise ability
could be bundled into the Python object 'source' if one really
preferred the risk of such a slight blurring of roles/identities
in exchange for slightly better concision of expression.  The
ability to call UnAdvise explicitly is an extra here, just like
being able to explicitly call close on a file object; if you do
not do that explicitly, it can be done implicitly on the __del__
of mgrObj if/when this-object drops the last reference to it (or
on the __del__ of source, if you prefer the architecture where
source and mgrObj merge into just one object).
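A minimal sketch of that implicit-unadvise idea, assuming a pure-Python
stand-in for the COM side.  FakeSource, its connect/disconnect methods and
the Manager class are all hypothetical names for illustration, not part of
win32com:

```python
class FakeSource:
    """Hypothetical stand-in for a COM event source (not win32com)."""
    def __init__(self):
        self.handlers = {}      # cookie -> (event name, callable)
        self.nextCookie = 1
    def connect(self, name, callback):
        cookie = self.nextCookie
        self.nextCookie = cookie + 1
        self.handlers[cookie] = (name, callback)
        return cookie
    def disconnect(self, cookie):
        del self.handlers[cookie]

class Manager:
    """Hypothetical mgrObj: remembers its cookies, unadvises on __del__."""
    def __init__(self):
        self.live = {}          # cookie -> source it was issued by
    def Advise(self, source, name, callback):
        cookie = source.connect(name, callback)
        self.live[cookie] = source
        return cookie
    def UnAdvise(self, cookie):
        source = self.live.pop(cookie)
        source.disconnect(cookie)
    def __del__(self):
        # Implicit cleanup, like a file object closing itself.
        for cookie in list(self.live.keys()):
            self.UnAdvise(cookie)

class HandlerClass:
    def OnClick(self, source, pEvtObj):
        return 1

source = FakeSource()
aHandler = HandlerClass()
mgrObj = Manager()
mgrObj.Advise(source, 'onclick', aHandler.OnClick)
connectedBefore = len(source.handlers)
del mgrObj      # last reference dropped: CPython runs __del__ right away
connectedAfter = len(source.handlers)
```

The explicit UnAdvise stays available; the __del__ only covers the case
where nobody bothered to call it.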

> Notwithstanding the code bloat, we are still left to communicating
> through global variables.  If the event handler and the main
> application code share state, it must be through global variables, or
> through the instance itself.  Sharing through globals doesn't appeal,
> and if we are sharing through the instance, then we are pretty close
> to what we have now...

I'm not sure which of the several instances involved you are
calling "the instance itself" in this context.

But let's give another example, assuming a mgrObj/source merge,
since you seem to prefer concision here, and an abstract event:

class SharedState:
    def __init__(self):
        self.foo=23
        self.bar=45

class Handler:
    def OnClick(self,source,details):
        share=source.sharedState
        share.foo=1+share.bar
        share.bar=2*share.foo

class Application:
    def __init__(self):
        self.myShare=SharedState()
        # getTheSourceObject stands for whatever yields the event source
        self.mySource=getTheSourceObject()
        self.mySource.sharedState=self.myShare
        self.myHandler=Handler()
        self.mySource.Advise('onclic',self.myHandler.OnClick)

No loops, no need to explicitly unadvise, no recourse to
global variables for access to shared state.

When whoever-is-creating-the-Application-object drops its
last reference to it, note that there will be no other
references to said application-object outstanding in this
little setup.  So the app-obj will in turn automatically
drop its refs to self.myShare (which goes down to 1); to
self.mySource (goes down to 0); to self.myHandler (goes
down to 1).  And self.mySource is going away, so it drops
the references to the sharedstate object (goes down to 0,
goes away) and to the handler object (ditto).
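The reference-count walk above can be checked with a sketch like the
following.  It assumes a pure-Python Source class (with a Fire method) and
a trivial getTheSourceObject, both hypothetical stand-ins for the real COM
source; the cycle collector is switched off so that only plain reference
counting can reclaim anything:

```python
import gc
import weakref

class Source:
    """Hypothetical pure-Python event source, standing in for a COM one."""
    def __init__(self):
        self.callbacks = {}
    def Advise(self, name, callback):
        self.callbacks[name] = callback
    def Fire(self, name, details=None):
        # The source hands itself to the handler; no back-reference is stored.
        return self.callbacks[name](self, details)

def getTheSourceObject():
    return Source()

class SharedState:
    def __init__(self):
        self.foo = 23
        self.bar = 45

class Handler:
    def OnClick(self, source, details):
        share = source.sharedState
        share.foo = 1 + share.bar
        share.bar = 2 * share.foo

class Application:
    def __init__(self):
        self.myShare = SharedState()
        self.mySource = getTheSourceObject()
        self.mySource.sharedState = self.myShare
        self.myHandler = Handler()
        self.mySource.Advise('onclic', self.myHandler.OnClick)

gc.disable()                    # rule out help from the cycle collector
app = Application()
app.mySource.Fire('onclic')
fooAfterClick = app.myShare.foo
barAfterClick = app.myShare.bar
watchers = [weakref.ref(o) for o in
            (app, app.mySource, app.myHandler, app.myShare)]
del app                         # drop the one outside reference
allGone = all(w() is None for w in watchers)
gc.enable()
```

With the cycle collector disabled, allGone can only be true if no loop was
left among the four objects.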

Price to be paid for this: essentially a set of design
decisions to make things explicit.  The shared-state,
for example, is made explicitly into an object, a leaf
object to be precise, one holding no references to any
other actors in this little pantomime.  A reference to
it is also explicitly set (expando-like) as attribute
of the source-object, for the handler-object's benefit.

In real cases, the shared-object would no doubt include
references to such things as GUI elements, etc, but
that does not preclude its being leaf _in the context
of this pantomime_; basically, the requirement for no
explicit-unadvise being necessary is that no other loop
obtains *apart* from the handler<->source we have
explicitly broken by making the source an argument
to the handler's method[s].  If other loops obtain,
it would do no good wrt them if the source and handler
objects were merged, anyway!

And the connection of source-event and handler-object
is also explicit, with an Advise method, rather than
made implicitly e.g. by Sourcename_Eventname a la VB
(you do appear to appreciate this last explicitude:-).

There is, as you put it, "code bloat".  All of the
explicit "self.whatever" can be said to 'bloat' the
code wrt a language where 'whatever' implicitly meant
'self.whatever'.  Ooops wait, that's Python -- ok,
then, all of the explicit 'share.foo' can be said to
'bloat' the code wrt a language where 'foo' implicitly
meant 'share.foo'.  Why is the former good and the
latter bad?-)

Another way to put it: the DispatchWithEvents architecture
solves the loop eventsourceobj<->eventhandlerobj by
fusing the two mutually-dependent objects into one.

My proposed 'source-as-argument' architecture solves
it by leaving only the source->handler branch as part
of the actual *state* of the set of objects, while
the handler->source dependency is obtained transiently
by the fact that the source is an argument to the
handler's method.  In other words:
-- the source must always know the handler[s] because
    an event can happen anytime and the source needs
    to know who to broadcast it to; but,
-- the handler needs to know about the source _only
    while it's specifically handling an event_; it
    has no more-enduring need-to-know.
So, by letting the source keep a reference to the
handler in its state, and having the handler receive
the reference to the source as an argument to its
method, we are modeling in our architecture the
actual need-to-know needs involved (in "temporal"
terms, that is).  We can be said to respect information
hiding principles; there is a match between the
structure of the architecture and that of the
needs it meets.

Note that all of this is actually independent from
the design-choice of having Advise use single
bound-methods, or, entire source-interfaces at a
time.  That, as I said, is a choice of convenience
for the user and implementer.  Being able to bind
any callable object to a given event can be handy
(the script-programmer could use functions rather
than bound-methods); but if several events are to
be handled, it's more convenient to call Advise
once than several times.  And for the implementer
avoiding 'impedance-mismatch' between a COM model
that's oriented to whole-interfaces, and a Python
model oriented to single-events bound to single
callable objects, would be a convenience too.  As
I said, I could live with either resolution of
this, just as in Java I could live with Hejlsberg's
delegates (VJ++6) even though I preferred the Java
approach of binding whole-interfaces.
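To make the "either resolution" point concrete, here is a sketch of
whole-interface hookup layered on top of single-callable Advise.  The
Source class, its eventNames list and the free function AdviseAll are
assumptions for illustration; a real source would get its event names from
the COM type information:

```python
class Source:
    """Hypothetical source that only offers per-event Advise."""
    eventNames = ('onclic', 'onpluc', 'onparak')
    def __init__(self):
        self.callbacks = {}
    def Advise(self, name, callback):
        self.callbacks[name] = callback

def AdviseAll(source, handler):
    """Hook every event for which handler has a same-named method."""
    hooked = []
    for name in source.eventNames:
        method = getattr(handler, name, None)
        if method is not None:
            source.Advise(name, method)
            hooked.append(name)
    return hooked

class Handler:
    def onclic(self, source, details):
        return 'clic'
    def onpluc(self, source, details):
        return 'pluc'

src = Source()
hooked = AdviseAll(src, Handler())
plucResult = src.callbacks['onpluc'](src, None)
```

Going the other way (per-event hookup on top of a whole-interface model)
takes a small adapter object per callable, which is about as much work.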

Note, also, that the event-handling part of this
is reasonably independent from how the handler
and the app object share state.  We've solved the
source<->handler loop (just as you have in
DispatchWithEvents by merging the two objects);
other possible loops, such as app<->source or
app<->handler, may still exist (and the merging
in DispatchWithEvents would not make them any
easier to solve, on the contrary).  They can be
solved too (e.g. with explicit shared-state objs
and expando properties as in the above examples).
Having an explicit UnAdvise gives the script
programmer the choice of letting a loop exist,
and breaking it explicitly, if/when that's easier
for him or her.  I think it's a good choice to
have; Python is not about protecting people by
restricting their choices, is it?-)
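A sketch of that choice, assuming the same kind of hypothetical
pure-Python Source as a stand-in for the COM object.  LoopyHandler keeps
the source in its state on purpose, and the cycle collector is disabled so
only reference counting is at work:

```python
import gc
import weakref

class Source:
    """Hypothetical event source handing out cookies from Advise."""
    def __init__(self):
        self.callbacks = {}
        self.nextCookie = 1
    def Advise(self, name, callback):
        cookie = self.nextCookie
        self.nextCookie = cookie + 1
        self.callbacks[cookie] = (name, callback)
        return cookie
    def UnAdvise(self, cookie):
        del self.callbacks[cookie]

class LoopyHandler:
    """Deliberately stores the source: a source<->handler loop."""
    def __init__(self, source):
        self.source = source
    def OnClick(self, source, details):
        return 1

def build(unadvise):
    source = Source()
    handler = LoopyHandler(source)
    cookie = source.Advise('onclic', handler.OnClick)
    if unadvise:
        source.UnAdvise(cookie)     # break the loop by hand
    return weakref.ref(source)

gc.disable()                        # pure reference counting only
leaked = build(False)() is not None
freed = build(True)() is None
gc.enable()
gc.collect()                        # tidy up the loop left on purpose
```

Without the UnAdvise the pair survives the end of build; with it,
everything goes away as soon as the local names do.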

> > Anyway, the point is that the user-level object (the Python
> > class HandlerClass here) *need NOT hold* a reference to
> > source.
>
> At the cost of extreme ugliness, IMO.  All this advising and passing
> sources as synthesised parameters etc really doesn't sound too usable to
> me.

Ugliness is in the eye of the beholder.  I call it explicitness
and find it quite Pythonic.

Usability is, I think, proven by prior art.  Regarding the
synthesized-parameters, that's what the DHTML object model
is all about: each event-handler method receives an event
object which has the source-object among its attributes.
I think the W3C has done an excellent job in that design,
and zillions of DHTML-scripters are using it every day.

Regarding the Advise approach to explicit connection of a
handler to a source-interface (or a specific event), how
is that different from setting the onclic property of a
DHTML object, again?  The difference between:
    sourceElement.onclic=myClickFun
and
    sourceElement.Advise('onclic',myClickFun)
appears to be syntactic sugar.  I would slightly prefer
modeling it through a separate manager-object rather than
pretending each source-element has an 'Advise' method
(or a series of properties onthis,onthat,ontheother), but
that preference IS slight.

What IS "on the verge of unusable" for some needs is
HAVING to hardcode the source->handler connection, so
that events from a given source object are +always+
passed to a handler object that must be established
quite early in the source-object lifetime.  Changing
the handling on the fly, by Advise (or use whatever
other verb suits your fancy -- Connect, Handle, what
have you), is much more flexible than coding inside
a single handler-object some complex determination of
current-state.
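A small sketch of changing the handling on the fly, again assuming a
hypothetical pure-Python Source with Advise and Fire.  Plain functions
serve as handlers here, and each one re-Advises the source instead of
testing a state flag:

```python
class Source:
    """Hypothetical event source; Advise replaces any earlier callable."""
    def __init__(self):
        self.callbacks = {}
    def Advise(self, name, callback):
        self.callbacks[name] = callback
    def Fire(self, name, details=None):
        return self.callbacks[name](self, details)

def whileIdle(source, details):
    # A click while idle starts things, and switches the handling.
    source.Advise('onclic', whileBusy)
    return 'started'

def whileBusy(source, details):
    # A click while busy stops things, and switches back.
    source.Advise('onclic', whileIdle)
    return 'stopped'

src = Source()
src.Advise('onclic', whileIdle)
results = [src.Fire('onclic'), src.Fire('onclic'), src.Fire('onclic')]
```

The current state lives in which callable is hooked up, rather than in a
flag that one big handler has to consult on every event.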

> > Key advantage from my POV is that this lets
> > HandlerClass be stack-discipline and/or mark-and-sweep
> > garbage collected, without worry on 'impedance matching'
> > that to the reference-count discipline of COM gc.
>
> You are starting to lose me here.  What is "stack-discipline" in the
> context of Python code and objects?  Python and COM are both reference
> counted, and I don't see that changing any time soon, even taking into
> consideration Python's optional GC changes for 2.x.

Sorry, I hope I've clarified this now -- "my POV" here
referred to the work I had done on that *other* language.

> If you are arguing that GC will save the day later, then you should be
> advocating "make cycles, and to hell with it - the close() requirement
> will be gone one day".  However, it appears you are advocating we ask
> the programmer to jump through serious hoops to avoid these cycles at
> all.

The source<->handler loop goes away without 'serious hoops', IMHO.
Other loops may exist (and merging the source and handler identities
would do nothing to solve them) but that's a whole 'nother issue.

> > If I were to do it again --
>
> We _can_ do it again :-)  Events aren't that widely used yet.  If there
> is something significantly better, we can use it and deprecate the old
> ones.   It can be very hard to get feedback on stuff that doesn't exist
> yet :-)  The people on the COM developers mailing list didn't have a
> better idea at the time, but one almost certainly exists :-)

OK, let's give it a try.

> Please post a sample of how the code would look, assuming 3 events to
> be hooked, and state shared between the mainline code and the event
> handling code.

With the Hejlsberg-like hook-a-single-callable-at-a-time model:

class SharedState:
    def __init__(self):
        self.foo=23
        self.bar=45

class Handler:
    def OnClick(self,source,details):
        share=source.sharedState
        share.foo=1+share.bar
        share.bar=2*share.foo
    def OnPluc(self,source,details):
        share=source.sharedState
        if details.xCoordinate > 237:
            share.foo=share.foo+share.bar
            share.bar=share.foo-15
        else:
            source.aPowerMethod(1234)

class Application:
    def __init__(self):
        self.myShare=SharedState()
        self.mySource=getTheSourceObject()
        self.mySource.sharedState=self.myShare
        self.myHandler=Handler()
        self.mySource.Advise('onclic',self.myHandler.OnClick)
        self.mySource.Advise('onpluc',self.myHandler.OnPluc)
        self.mySource.Advise('onparak',self.myHandler.OnPluc)

for example.  Here, we want to handle both 'onpluc' and
'onparak' events in the same way (depending on xCoordinate
of the event's details) and in this model the natural way is
to hook up the same callable to either.

If one dislikes the delegate-model, and wants to hook up
by whole interfaces instead, then it gets to:

class SharedState:
    def __init__(self):
        self.foo=23
        self.bar=45

class Handler(whateverBase()):
    def cmn(self,source,details):
        share=source.sharedState
        if details.xCoordinate > 237:
            share.foo=share.foo+share.bar
            share.bar=share.foo-15
        else:
            source.aPowerMethod(1234)
    def onclic(self,source,details):
        share=source.sharedState
        share.foo=1+share.bar
        share.bar=2*share.foo
    def onpluc(self,source,details):
        self.cmn(source,details)
    def onparak(self,source,details):
        self.cmn(source,details)

class Application:
    def __init__(self):
        self.myShare=SharedState()
        self.mySource=getTheSourceObject()
        self.mySource.sharedState=self.myShare
        self.myHandler=Handler()
        self.mySource.AdviseAll(self.myHandler)

Six of one, half a dozen of the other.  Here, the natural
way to handle two events the same way becomes to pull the
common code up into a separate utility method and call that.

Note that in either case we're ignoring the cookie[s]
returned by Advise or AdviseAll, since in this case we
do not need them for later UnAdvise.  That's OK too;
we have the OPTION to unadvise explicitly, just like a
file object MAY be explicitly closed, but there is no
COMPULSION to do it.

Many details can be changed (using a separate mgrObj
rather than self.mySource itself as the object on which
Advise or AdviseAll is called, etc, etc), but the key
idea to break the loop is:

a handling-method receives as a parameter an object
from which it may, if needed, retrieve important
information on-the-fly, rather than having to store
it earlier as part of its state.  That object may
be a fully synthetic one, as in DHTML's event-object;
or it may share-identity with the source-object for
the event (with some synthetic possibilities given
by "expando" addition of properties to the object).
Retrieving the *source* from the argument is the
crucial part, as it breaks the source<->handler
loop.
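For the fully synthetic variant, a sketch along these lines may help.  The
Event class, the Fire method and the single-argument handler signature are
assumptions for illustration, in the spirit of DHTML's event object rather
than a copy of it:

```python
class Event:
    """Hypothetical synthetic event object, built afresh per firing."""
    def __init__(self, source, name, xCoordinate=0):
        self.source = source
        self.name = name
        self.xCoordinate = xCoordinate

class Source:
    """Hypothetical event source that wraps itself in an Event."""
    def __init__(self):
        self.callbacks = {}
        self.power = 0
    def Advise(self, name, callback):
        self.callbacks[name] = callback
    def aPowerMethod(self, amount):
        self.power = self.power + amount
    def Fire(self, name, xCoordinate=0):
        event = Event(self, name, xCoordinate)
        return self.callbacks[name](event)

class Handler:
    def OnPluc(self, event):
        if event.xCoordinate > 237:
            return 'right of 237'
        # The source is fetched from the argument, never from self.
        event.source.aPowerMethod(1234)
        return 'powered'

src = Source()
src.Advise('onpluc', Handler().OnPluc)
first = src.Fire('onpluc', 300)
second = src.Fire('onpluc', 10)
```

The Event holds the source only for the duration of the call, so nothing
in the lasting state points from the handler back to the source.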

Note that the use of expando properties is NOT crucial,
just one approach of many to breaking _other_ cycles
(connected with the need to share state between the
handling methods and the application object that is
overseeing it all).  Clearly, the shared-object state
might just as well be placed (by the application
object, or on the handler's __init__, etc etc) as
part of the handler's state and fetched from there.
That reduces flexibility, but, as we're not using that
flexibility here, nothing is lost:

class SharedState:
    def __init__(self):
        self.foo=23
        self.bar=45

class Handler(whateverBase()):
    def cmn(self,source,details):
        share=self.sharedState
        if details.xCoordinate > 237:
            share.foo=share.foo+share.bar
            share.bar=share.foo-15
        else:
            source.aPowerMethod(1234)
    def onclic(self,source,details):
        share=self.sharedState
        share.foo=1+share.bar
        share.bar=2*share.foo
    def onpluc(self,source,details):
        self.cmn(source,details)
    def onparak(self,source,details):
        self.cmn(source,details)

class Application:
    def __init__(self):
        self.myShare=SharedState()
        self.mySource=getTheSourceObject()
        self.myHandler=Handler()
        self.myHandler.sharedState=self.myShare
        self.mySource.AdviseAll(self.myHandler)

You can see that the difference with the previous example
is TRULY minute!

> > I think I'd drop the use of
> > bound-methods/delegates and go back to whole-class
> > connection.  But maybe that's because hooking a whole
> > source-interface is what I'm doing anyway every time I
>
> sorry - you have lost me again.

Hope I've clarified this part.  I consider it minor, though
from the flames about delegates vs listener-interfaces on
the Java groups one wouldn't think that:-).

> > *blink* how does one do that short of getting the CVS tree
> > and adding C-level stuff?
>
> Why change things at the C level?  The C level code is basically
> exposing the C API to Python.  If you made changes in C, you would
> still be working at the same API level.

I've got to go back and study that part again, I guess.  I
had definitely not understood this from the docs; guess I'll
have to get the CVS tree again for the purpose of *studying*
the sources:-).

> > > NFI.  Do we have proof that IE actually fires it?  Maybe it only
> > > does on idle, or something strange?
> >
> > Fair enough!  So I whipped out my old trusty AuHelp.Listener object
>
> Damn.  So it does.  Hrm.  OK - back to NFI again :-)

That stands for "No Fine Idea", right?-)

> > > From the tutorial I gave at the last conference (powerpoint slides
> > > available from my starship page):
> >
> > I briefly looked around but can't find it -- got the URL please...?
>
> Main page -> Conferences link -> First conference linked
>
> http://starship.python.net/crew/mhammond/conferences/ipc8/

Thanks, got it.

Alex