[IPython-dev] ipython1 and synchronous printing of stdout

Thu Jul 24 16:23:53 EDT 2008

I fear that I may have muddied the waters in talking about
notification and delegation all together. They really solve two
different problems. I'll try to lay that out below, but the short
version is that notification/observation seems to be the most
appropriate when the flow of information is unidirectional (e.g. a
frontend wants to know that something was printed to stdout), whereas
delegation is more appropriate when the flow is bidirectional (e.g.
the Shell wants to give a delegate the opportunity to modify its
internal behavior). I think that Brian and I are really talking about
the former, while Gael and Fernando are really talking mostly about
the latter. I'll try to make the case below that delegation is not the
right solution for the observer problem, but that it may also be very
useful.

I will be truly offline until Sunday, so I'll say my piece here, and
let the discussion run its course.

On Thu, Jul 24, 2008 at 12:22 AM, Brian Granger <ellisonbg.net at gmail.com> wrote:
>> I think there's a fourth, that Gael mentioned in passing and which
>> we've been discussing here face to face for a while: using objects
>> whose basic behavior defines the core API, but which can be given at
>> construction time instead of being hardcoded in our various __init__
>> methods.  This allows subclasses (or even non-subclassing application
>> writers who use the code as a library to build their own tools) to
>> cleanly provide only the level of behavior modification they need.
>
> Yes, this is a design pattern that I very much like and that is well
> suited for certain things in ipython.  We actually use this in a few
> places in the ipython1 code:
>
> 1.  An IPython engine takes an interpreter class as a parameter in its
> init method.  Thus a user can pass the engine a custom interpreter
> subclass to get custom behavior.
>
> 2.  We don't do it yet, but the interpreter could take a Traited dict
> like object to use as the users namespace.

I agree that parameterizing many of the interactions is useful
(especially for testing). When we want to be able to modify the
Interpreter's behavior without subclassing Interpreter, this is
probably the right approach. However, if we believe that different
delegates (e.g. the frontends) will want to receive some of the same
notifications, the observer pattern may be more appropriate *for event
notification*...
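
To make the parameterization idea concrete, here is a minimal sketch of
an engine that takes its interpreter class as a constructor argument.
The class and method names (Engine, Interpreter, LoggingInterpreter,
execute, run) are made up for illustration and are not the actual
ipython1 API.

```python
class Interpreter:
    """Stand-in for the real interpreter; only the shape matters here."""

    def execute(self, source):
        return "ran: " + source


class LoggingInterpreter(Interpreter):
    """Hypothetical subclass that adds custom behavior."""

    def __init__(self):
        self.log = []

    def execute(self, source):
        self.log.append(source)
        return super().execute(source)


class Engine:
    # The interpreter class is a constructor parameter instead of being
    # hardcoded in __init__, so callers (and tests) can swap it out.
    def __init__(self, interpreter_class=Interpreter):
        self.interpreter = interpreter_class()

    def run(self, source):
        return self.interpreter.execute(source)


default_engine = Engine()
custom_engine = Engine(interpreter_class=LoggingInterpreter)
custom_engine.run("a = 1")
```

This works well for swapping in one collaborator at construction time;
it says nothing yet about several parties wanting the same events.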

As Brian points out below, delegation can lead to some unexpected
dependencies. Let me give one other example: suppose we use subclasses
of parameterized objects to add behavior to the interpreter. In this
scenario, the Interpreter constructor gets a parameter to which it
delegates responsibility for handling event notification (by callback,
observer pattern, etc.). Suppose Gael implements this object to use
callbacks for the events he needs in his Wx frontend. I also implement
a subclass to provide an observer-pattern based implementation of
event handling. Fernando wants to avoid the overhead of
events/callbacks and so passes None as this parameter (or provides a
subclass that has no-op methods) [1]. Now imagine that we want to add
new functionality to the Interpreter class that changes when or how
the Interpreter will call this parameterized object (such as calling
the delegate's wrote_to_stdout method when the Interpreter writes to
stdout). Even if the change does not directly affect the use-case for
which Gael and I designed our subclasses, all our code needs to be
reviewed to see if it is affected by this change in the Interpreter's
internal implementation (in the example case, all of the possible
delegates need to be checked to make sure they didn't use
wrote_to_stdout as a method name for a different purpose *and* the
common base class needs to be updated to respond to the
wrote_to_stdout method with a no-op). In other words, because there is
a direct contractual relationship between the Interpreter and these
subclasses, dependencies are likely. In the observer pattern, the
Interpreter could fire a new event, but it would be ignored by any
observer that didn't explicitly register for that event type. No
changes would be needed outside of the Interpreter class. A real-world
example, I believe, is the matplotlib backend architecture. The
matplotlib library uses a pluggable backend module to render plots.
Each backend thus needs to be maintained in parallel, a difficult and
growing problem for the matplotlib team.
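
A rough sketch of the failure mode I have in mind, using the
wrote_to_stdout example from above. Everything here is hypothetical
(the write_stdout method, the OldWxDelegate class and its
execution_finished method); it only shows how a delegate written
against an older Interpreter breaks when the Interpreter starts calling
a new method on it.

```python
class Interpreter:
    def __init__(self, delegate=None):
        self.delegate = delegate

    def write_stdout(self, text):
        # New in this hypothetical revision: the Interpreter now calls
        # the delegate after every write. Note the test for None, which
        # is the cost of supporting the "no delegate" case.
        if self.delegate is not None:
            self.delegate.wrote_to_stdout(text)
        return text


class OldWxDelegate:
    """Written before the Interpreter ever called wrote_to_stdout."""

    def execution_finished(self, result):
        return result


interp = Interpreter(delegate=OldWxDelegate())
try:
    interp.write_stdout("hello")
    broke = False
except AttributeError:
    # The old delegate has no wrote_to_stdout, so the new call fails.
    broke = True
```

The fix is to touch every delegate, or to add a no-op to a shared base
class, which is exactly the review burden described above.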

However, as Brian points out, delegation is *very* useful and is
probably the best way to handle the user_ns use case.

>
> We definitely should do more of this.  But, I think this design
> pattern addresses a slightly different need than the
> callback/observer/notification stuff.  Here is why:
>
> The real benefit of the observer/notification model is that everything
> can be completely transient.  For example, you might have a notifier
> that observes the user's namespace.  But you might only want it to
> operate when the user opens a specific window.  Another example is
> when an observer is hooked up to an unreliable network connection.  So
> I guess I would add another design constraint that I have until now
> kept in the back of my mind:  we need loose coupling that is also
> dynamic and transient.  And for these situations, I still think the
> observer pattern is the best solution.

Just to reiterate and paraphrase Brian, the observer pattern lets the
Interpreter not know *anything* about the frontend (even whether it
exists or not) that is observing its behavior and it lets the frontend
not know *anything* about the Interpreter's implementation except that
it will fire notifications for the events defined in its interface.
So, when a new event is added to the Interpreter, no observer code
needs to be modified (it doesn't even need to know that the new
notification is fired). If the Interpreter's implementation is changed
so that, e.g. notification of writing to stdout happens immediately
after writing or after some short delay, none of the observing code
needs to be reviewed as long as the original Interpreter API didn't
specify the exact time. In other words, using an intermediary
notification center enforces a loose coupling; the only dependency
between Interpreter and frontend is the API that defines the
event/notification itself.
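
For illustration, a minimal notification center might look like the
sketch below. The names (NotificationCenter, add_observer,
remove_observer, post, STDERR_EVENT_TYPE) are assumptions of mine, not
an existing IPython interface. It shows the two properties discussed
above: an event nobody registered for is silently ignored, and an
observer can come and go at runtime.

```python
from collections import defaultdict


class NotificationCenter:
    def __init__(self):
        # Maps an event type to the list of callbacks registered for it.
        self._observers = defaultdict(list)

    def add_observer(self, event_type, callback):
        self._observers[event_type].append(callback)

    def remove_observer(self, event_type, callback):
        self._observers[event_type].remove(callback)

    def post(self, event_type, **info):
        # Iterate over a copy so observers may unregister while notified.
        for callback in list(self._observers.get(event_type, ())):
            callback(event_type, info)


STDOUT_EVENT_TYPE = "stdout"
STDERR_EVENT_TYPE = "stderr"  # a "new" event nobody has registered for

center = NotificationCenter()
seen = []


def frontend_on_stdout(event_type, info):
    seen.append(info["text"])


center.add_observer(STDOUT_EVENT_TYPE, frontend_on_stdout)
center.post(STDOUT_EVENT_TYPE, text="hello")
center.post(STDERR_EVENT_TYPE, text="ignored")  # no observers, no effect
center.remove_observer(STDOUT_EVENT_TYPE, frontend_on_stdout)
center.post(STDOUT_EVENT_TYPE, text="not delivered")
```

The poster and the observer share only the event type and the shape of
the info dictionary; neither holds a reference to the other.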

Using the observer pattern makes adding new event types to the shell
easy too. For example, we could write function decorators to fire
events before/after the function. So, assuming all stdout goes to a
print_stdout() method in shell or some such, then

def print_stdout(...):
   blah...

becomes

@notify_after(STDOUT_EVENT_TYPE)
def print_stdout(...):
   blah...

In other words, no code in the function that fires the notification
needs to be modified if the notification can be fired before or after
the entire method completes. We benefit from code reuse in the notify
method and don't have to rewrite similar functionality for each
individual callback use (this was Brian's point that a properly
implemented callback system is, essentially, a notification center).
In the case of Fernando's use-case above, notify_after() can call
through directly to the print_stdout function. The only performance
penalty is one additional method call. Is that acceptable?
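
One way notify_after could be written is sketched below. The
module-level _observers registry and the add_observer helper are
placeholders I made up so the example runs on its own; a real version
would presumably post to a shared notification center instead.

```python
import functools

STDOUT_EVENT_TYPE = "stdout"
_observers = {}  # event type -> list of callbacks (hypothetical registry)


def add_observer(event_type, callback):
    _observers.setdefault(event_type, []).append(callback)


def notify_after(event_type):
    """Decorator factory: fire event_type once the wrapped call returns."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            # With no observers registered this loop body never runs.
            for callback in list(_observers.get(event_type, ())):
                callback(event_type, args, kwargs)
            return result
        return wrapper
    return decorator


@notify_after(STDOUT_EVENT_TYPE)
def print_stdout(text):
    # The body knows nothing about notifications.
    return len(text)


received = []
add_observer(
    STDOUT_EVENT_TYPE,
    lambda event_type, args, kwargs: received.append(args[0]),
)
print_stdout("hello")
```

In this sketch the no-observer case costs the extra wrapper call plus
one dictionary lookup, so it is close to, but not exactly, the "one
additional method call" estimate.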

>
>> This seems to provide a simple solution to our design question: all we
>> need to do is to  clarify which objects are our core publicly
>> modifiable components in the API, and then users can tweak only the
>> parts they need.  Said objects can themselves provide
>> observer/delegation behavior if they so desire, but they can do it in
>> the most appropriate way for a given problem.  For example, someone
>> writing a Traits app (who's already committed to using Traits for
>> their own reasons) can simply stick in there traited versions of the
>> same things, and with essentially zero extra code they get the Traits
>> event model everywhere.
>
> I worry that such interfaces _can_ at times be a bit too implicit and
> thus hide subtle behaviors and introduce code coupling that isn't
> obvious.  One example of a subtle behavior is the following.  Traits
> is based on a model that says this: "an interface consists of a set of
> attributes that you simply get and set as attributes."  But, the second
> you go to propagate an interface over a network connection, you
> discover that it is really difficult.  Interfaces that are "network
> friendly" tend to i) be methods/functions that ii) have arguments
> with very basic types.
>
> We do have some examples in IPython.kernel where we propagate
> interfaces that are attribute based, but a good amount of extra work
> and care is required.
>
> Thus, while I think the idea of passing a Traited dict to the
> interpreter to observe the user's namespace is a beautiful idea, the
> reality is much more painful:  there is no straightforward way of
> propagating that interface over a network connection.
>
> With a standardized observer/notifier pattern that is based on methods,
> it is very straightforward to:
>
> * Propagate observation/notification events over a network
> * Unregister such events when network connections fail
> * Integrate such events with Twisted.
> * Integrate with other event loops like wx, qt, cocoa
>
> One thing that I hadn't realized is that the observer/notifier pattern
> has come up for Min and I in IPython.kernel a number of times, but we
> have never taken the time to really abstract it properly.
>
> .....a few minutes later...
>
> I just reread this email and I think I am talking in circles and a bit
> tired.  But, I do think we need to have a solution that is "network
> friendly".
>
> Cheers,
>
> Brian
>
> PS - starting tomorrow, I will be offline for about a week.
>
>
>> In a Cocoa environment, these objects could be lightly wrapped
>> versions of the originals that register (via a modified __getattr__
>> for example) the necessary observation/update calls into the
>> NSDistributedNotificationCenter that Cocoa provides, if that's the
>> best way to do it in such an environment.
>>
>> And someone who just needs a callback here and there can simply stick
>> their callback-enhanced object where they want.  I could even see this
>> being very handy for occasional debugging, by quickly activating
>> tracing versions of certain objects as needed.
>>
>> One thing that I like about this approach is that it doesn't
>> necessarily close the door on a *future* implementation of an
>> ipython-wide notification center, if we end up finding in practice
>> that it's really needed as the code grows.  But right now, it seems to
>> me that this simpler approach solves all of our problems, with the
>> advantage of introducing exactly *zero* (not small, but truly zero)
>> overhead for the normal use that has no customizations.

[1] Re: performance
I believe that Fernando's "no-op" parameter must incur *some*
overhead; either the Interpreter must test for None, or the
Interpreter must call a method in Fernando's object which does
nothing. In other words, there's no free lunch. I can't think of a way
to add the ability to notify observers and/or modify the behavior of
Interpreter via delegates with truly zero overhead while avoiding the
parallel implementation problem that I describe above.

>>
>> Does this sound reasonable to others, or did I totally miss this
>> particular boat?  I really want us all to be happy with the solution
>> we find to this so that we move forward reasonably convinced of it
>> being viable.
>>
>> Cheers,
>>
>> f
>>
>