[Web-SIG] Web Bus event graphs

Thu Jun 28 19:13:21 CEST 2007

Graham Dumpleton wrote:
> A question about the idea of a bus.start()-like event to
> indicate startup.
> 
> Problem with this is that under mod_wsgi the actual web server child
> process has possibly started long before a request may come in which
> targets a specific WSGI application. This is because loading of a WSGI
> application is effectively done by lazy loading, i.e., the code file only
> gets loaded when URL for a request maps to it.
> 
> This is different to where a Python based web server is used as
> generally one would in the program script load in the WSGI application
> before you even start the web server, as you would need to get the
> application entry point to be able to construct the fixed URL entry
> point for the root. Pylons and Paste may be an exception to this as
> not sure at what point it actually will load things.
> 
> How do you see being able to handle a startup like event in that case
> for a WSGI application when they aren't effectively being preloaded?
> How would you notify just that one application when it does finally
> get loaded, or do you?

In terms of the "site event bus" model, I would just say that lazy
applications join the start/stop cycle a bit later. They miss the first
"start" notification, so they'd either have to not subscribe to the
'start' channel at all, or would have to call their start listeners
manually on load/first request.
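
A minimal sketch of the second option, assuming a hypothetical Bus
class and join_bus helper (neither is an existing API; the names are
only for illustration). An application loaded after the site has
started subscribes as usual, then runs its own start listener by hand:

```python
class Bus:
    """Hypothetical site event bus; all names are illustrative."""

    def __init__(self):
        self.listeners = {}    # channel name -> list of callbacks
        self.started = False

    def subscribe(self, channel, callback):
        self.listeners.setdefault(channel, []).append(callback)

    def publish(self, channel):
        for callback in list(self.listeners.get(channel, [])):
            callback()

    def start(self):
        self.started = True
        self.publish("start")

    def stop(self):
        self.publish("stop")
        self.started = False


def join_bus(bus, on_start, on_stop):
    """Attach a (possibly lazily loaded) app to the start/stop cycle."""
    bus.subscribe("start", on_start)
    bus.subscribe("stop", on_stop)
    if bus.started:
        # The site's "start" notification already fired, so the late
        # joiner calls its own start listener manually.
        on_start()


bus = Bus()
events = []
bus.start()    # the site starts before any lazy app is loaded

# Later, the first request that maps to the app triggers its load:
join_bus(bus,
         lambda: events.append("app start"),
         lambda: events.append("app stop"))
bus.stop()
```

The late joiner still receives the "stop" notification through the
bus; only the missed "start" has to be replayed by hand.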

> ...the actual web server child process has possibly started
> long before a request may come in...

That reminds me, I wanted to also discuss another potential channel pair
for managing per-thread resources. CherryPy has an (on_start_thread,
on_stop_thread) pair for registering such callbacks.

Currently, CP invokes *_thread events by checking thread IDs on each
request. If the thread ID has been seen before (there's a set of "seen
thread IDs"), nothing happens; if it hasn't been seen, then
on_start_thread listeners are invoked. Since that chunk of code has to
work with various multithread schemes, the on_stop_thread listeners
aren't called until server shutdown (!).
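
Roughly, the scheme looks like the sketch below. This is an
illustration of the idea, not CherryPy's actual code: the ThreadEvents
class and its method names are made up, and it uses the modern
threading.get_ident() call.

```python
import threading


class ThreadEvents:
    """Hypothetical per-request thread tracker (illustrative only)."""

    def __init__(self):
        self.seen = set()              # the "seen thread IDs"
        self.lock = threading.Lock()
        self.on_start_thread = []      # callbacks taking a thread ID
        self.on_stop_thread = []

    def acquire(self):
        """Call at the top of every request."""
        ident = threading.get_ident()
        with self.lock:
            if ident in self.seen:
                return                 # seen before: nothing happens
            self.seen.add(ident)
        for listener in self.on_start_thread:
            listener(ident)

    def shutdown(self):
        """The only point at which stop listeners can be called."""
        with self.lock:
            idents, self.seen = self.seen, set()
        for ident in idents:
            for listener in self.on_stop_thread:
                listener(ident)


tracker = ThreadEvents()
started, stopped = [], []
tracker.on_start_thread.append(started.append)
tracker.on_stop_thread.append(stopped.append)


def serve_three_requests():
    for _ in range(3):
        tracker.acquire()              # only the first call fires


worker = threading.Thread(target=serve_three_requests)
worker.start()
worker.join()
tracker.shutdown()                     # stop listeners fire only now
```

Note that the component has no way to learn that the worker thread
has exited, so the stop listeners have to wait for shutdown().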

That's pretty inefficient on its own, but when several WSGI components
in the stack all maintain their own map of seen threads, it becomes
unwieldy pretty quickly. If "the site" could notify such listeners, it
would be more accurate ("thread stop" events would fire when the thread
actually stops) and take less memory, since the site controller would be
the only code with a thread map (and probably already has one anyway).
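
As a sketch of what that could look like, assume the site controller
owns its worker threads and is handed some publish(channel, ident)
callable. Both the SiteWorker class and the channel names
"start_thread" and "stop_thread" are hypothetical:

```python
import threading


class SiteWorker(threading.Thread):
    """Hypothetical worker thread owned by the site controller."""

    def __init__(self, publish, work):
        super().__init__()
        self.publish = publish     # callable(channel, thread_id)
        self.work = work           # callable that serves requests

    def run(self):
        ident = threading.get_ident()
        # Fired once, from inside the new thread, before any request.
        self.publish("start_thread", ident)
        try:
            self.work()
        finally:
            # Fired when the thread actually stops, not at shutdown.
            self.publish("stop_thread", ident)


log = []


def publish(channel, ident):
    log.append(channel)


worker = SiteWorker(publish, lambda: log.append("request"))
worker.start()
worker.join()
```

Here no component in the stack keeps its own set of seen threads, and
no per-request check is needed.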

This isn't limited to threads, by the way. When people talk about
"per-thread" resources, that can usually be safely generalized to
"per-logical-process", where "logical process" encompasses threads,
processes (since they have a main thread, at least), and even tasklets
(Arnar Birgisson is working on a Stackless WSGI server as we speak).

Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org