[Web-SIG] Standardized configuration
grahamd at dscpl.com.au
Sun Jul 17 12:04:48 CEST 2005
On 17/07/2005, at 6:16 PM, Ian Bicking wrote:
>> The pipeline itself isn't really late bound. For instance, if I was
>> to create a WSGI middleware pipeline something like this:
>> server <--> session <--> identification <--> authentication <-->
>> <--> challenge <--> application
>> ... session, identification, authentication, and challenge are
>> middleware components (you'll need to imagine their implementations).
>> And within a module that started a server, you might end up doing
>> something like:
>> def configure_pipeline(app):
>> return SessionMiddleware(
> This is what Paste does in configuration, like:
> SessionMiddleware, IdentificationMiddleware,
> AuthenticationMiddleware, ChallengeMiddleware])
> This kind of middleware takes a single argument, which is the
> application it will wrap. In practice, this means all the other
> parameters go into lazily-read configuration.
Sorry, but you have given me a nice opening here to hijack this thread
a bit and make some comments and pose some questions about WSGI that I
have been thinking on for a while.
My understanding from reading the WSGI PEP and examples like that above
is that the WSGI middleware stack concept is very much tree like, but at
any specific node within the tree, one can only traverse into one child.
Ie., a parent middleware component could make a decision to defer to one
child or another, but there is no means of really trying out multiple
choices until you find one that is prepared to handle the request. The
only way around this seems to be to make the linear chain of nested
applications longer and longer, something which to me just doesn't sit
right. In some respects the need for the configuration scheme is in part
to make that less unwieldy.
To explain what I am going on about, I am going to use examples from
work I have been doing with componentised construction of request
handler stacks in mod_python. I will not use the term middleware here,
as someone in this discussion has already made the point that the
components being talked about here aren't really middleware, and in what
I have been doing I have been taking it to an even more fine grained
level. I believe I can draw a reasonable analogy to mod_python, as a
mod_python request handler and a WSGI application both provide the most
basic function of providing the service for responding to a request,
they just do so in different ways.
Normally in mod_python a handler can return an OK response, an error
response, or a DECLINED response. The DECLINED response is special and
indicates to mod_python that any further content handlers defined by
mod_python should be skipped and control passed back up to Apache so
that it can potentially serve up a matched static file.
What I am doing is making it acceptable for a handler to also return
None. If this were returned by the highest level handler, it would
equate to the same as DECLINED, but within the context of middleware
components it has a slightly relaxed meaning. Specifically, it indicates
that that handler isn't returning a response, but not that the request
as a whole is being DECLINED, causing a return to Apache.
Doing this means that within the context of a tree based middleware
stack, one can introduce a list of handlers at a particular node in the
stack. Each handler in the list will in turn be tried to see if it
wishes to handle the request, returning either an error or valid
response, or None. If it doesn't return a response, the next handler in
the list would be tried until one is found, and if one isn't, then None
is passed back to the parent middleware component.
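
As a rough illustration of the dispatch rule just described, here is a
minimal self-contained sketch. It does not use mod_python at all:
"FakeRequest", "try_handlers" and the two example handlers are
hypothetical names made up for this example, and a plain string stands
in for a real response.

```python
# Hypothetical sketch: try each handler in turn until one returns something
# other than None. Nothing here is mod_python API.

class FakeRequest:
    """Stand-in for a request object (assumption for illustration only)."""
    def __init__(self, uri):
        self.uri = uri


def try_handlers(req, handlers):
    """Return the first non-None result, or None if every handler passes."""
    for handler in handlers:
        result = handler(req)
        if result is not None:
            return result
    return None


def block_backup_files(req):
    # Returns an error for editor backup files, otherwise passes (None).
    if req.uri.endswith("~"):
        return "403 Forbidden"
    return None


def serve_page(req):
    # Final handler in the list: always produces a response.
    return "200 OK: " + req.uri


handlers = [block_backup_files, serve_page]
print(try_handlers(FakeRequest("/index.html"), handlers))
print(try_handlers(FakeRequest("/index.html~"), handlers))
# With no handler willing to respond, None goes back to the parent.
print(try_handlers(FakeRequest("/index.html"), [block_backup_files]))
```

The last call shows the case where the whole list passes, so the caller
(the parent component) gets None and can decide what to do next.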
This all means I could write something like:
handler = Handlers(
This handler might be associated with any access to a directory as a
whole. In iterating over each of the handlers it filters out requests to
files that we don't want to provide access to, with the final handler
deferring to a handler within a Python module associated with the actual
resource being requested. Although Apache provides means of filtering
out requests, it only works properly for physical files and not virtual
resources identified by way of the path info.
For example, a file "page.tmpl" (a Cheetah file) could have a "page.py"
file that defines:
handler = Handlers(
Again, more filtering and finally a handler is triggered which knows how
to trigger a precompiled Cheetah template stored as a Python module.
All in all a similar tree like structure to WSGI, except you have the
ability to iterate through handlers at one level, with them being able
to declare that they aren't providing a response and instead allowing
the next handler to be tried.
My experience with this so far is that it has allowed more fine grained
components to be created which provide specific filtering without it
all turning into a mess due to having to nest each handler within
another in a big pipeline, as it seems things must be done in WSGI.
In mod_python one already has access to a table object storing
options set within the Apache configuration for mod_python, plus the
ability to add Python objects to the mod_python request object itself.
In terms of configuration, using this ability of a list of handlers
where they don't actually return a response seems to me to make it
easier to avoid having a separate configuration system for most stuff.
For example, I can have a handler "SetPythonOption" which sets an option
in the options table object and always returns None, thus passing
control on to the next handler. In the highest level handler, before the
point where the request is dispatched off to a separate Python module or
special purpose handler, one can thus define the configuration as
necessary.
handler = Handlers(
In other words, the code itself contains the configuration and one
doesn't have to worry about where the configuration is found and working
out what you may need from it. Of course you could still have a separate
configuration object and provide a special purpose handler which merges
that into the environment of the request object in some way.
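
The idea of a handler that only records an option and then passes can
be sketched roughly as below. This is not the "SetPythonOption" or
"Handlers" code from my own work: "SetOption", "Chain", "FakeRequest"
and "show_options" are hypothetical stand-ins, and a plain dict plays
the role of the options table.

```python
# Hypothetical sketch of configuration-setting handlers; not mod_python API.

class FakeRequest:
    def __init__(self):
        self.options = {}   # stands in for the per-request options table


class SetOption:
    """Handler that records an option and always passes by returning None."""
    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __call__(self, req):
        req.options[self.name] = self.value
        return None


class Chain:
    """Try handlers in order; return the first non-None result."""
    def __init__(self, *handlers):
        self.handlers = handlers

    def __call__(self, req):
        for handler in self.handlers:
            result = handler(req)
            if result is not None:
                return result
        return None


def show_options(req):
    # Final handler: reads what the earlier handlers configured.
    return "debug=%s" % req.options.get("debug")


handler = Chain(SetOption("debug", "on"), show_options)
print(handler(FakeRequest()))
```

Because "SetOption" never returns a response, it can sit anywhere ahead
of the handler that needs the setting.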
For this latter case, in line with how the request object is used, you
might have something like:
config = getApplicationConfig()
handler = Handlers(
Having done that, any later handler could use "req.config" to get
access to the configuration object and use it as necessary. In WSGI such
things would be placed into the "environ" dictionary and passed along to
the wrapped application.
One last example is what a session based login mechanism might look
like, since this was one of the examples posed in the initial
discussion. One might have a handler for a whole directory which
contains:
_userDatabase = _users.UserDatabase()
handler = Handlers(
# Create session and stick it in request object.
# Login form shouldn't require user to be logged in to access it.
# Serve requests against login/logout URLs and otherwise
# don't let request proceed if user not yet authenticated.
# Will redirect to login form if not authenticated.
Again, one has done away with the need for a configuration file as the
code itself specifies what is required, along with the constraints as to
what order things should be done in.
Another thing this example shows is that handlers, when they return None
to indicate they are not returning an actual response, can still add to
the response headers, for example the special cookies required by
sessions, or other headers.
In terms of late binding of which handler is executed, the handler that
dispatches to a Python module is one example, in that it selects which
Python module to load only when the request is being handled. Another
example of late construction of an instance of a handler in what I am
doing, albeit of the same type each time, is:
self.__req = req
self.__req.content_type = "text/html"
handler = IfExtensionEquals("html",HandlerInstance(Handler))
First off, the "HandlerInstance" object is only triggered if the request
against this specific file based resource was by way of a ".html"
extension. When it is triggered, it is only at that point that an
instance of "Handler" is created, with the request object being supplied
to the constructor.
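
A rough, self-contained sketch of that late construction follows. The
class names "IfExtensionEquals" and "HandlerInstance" are reused from
above, but these bodies are only a guess at the behaviour described,
not the actual implementations. "FakeRequest" and "Page" are
hypothetical, and calling the new instance with no arguments is an
assumption of this sketch.

```python
# Hypothetical sketch; these classes imitate the names used above but are
# not the original implementations.
import os


class FakeRequest:
    def __init__(self, filename):
        self.filename = filename
        self.content_type = None


class HandlerInstance:
    """Create a fresh instance of `factory` per request, then call it."""
    def __init__(self, factory):
        self.factory = factory

    def __call__(self, req):
        instance = self.factory(req)   # constructed only when reached
        return instance()              # assumption: instance takes no args


class IfExtensionEquals:
    """Run `handler` only when the requested file has the given extension."""
    def __init__(self, extension, handler):
        self.extension = "." + extension
        self.handler = handler

    def __call__(self, req):
        if os.path.splitext(req.filename)[1] == self.extension:
            return self.handler(req)
        return None   # pass, so a sibling handler can be tried


class Page:
    created = 0   # counts constructions, to show they happen late

    def __init__(self, req):
        Page.created += 1
        self.req = req

    def __call__(self):
        self.req.content_type = "text/html"
        return "<html>hello</html>"


handler = IfExtensionEquals("html", HandlerInstance(Page))
print(handler(FakeRequest("page.txt")), Page.created)
print(handler(FakeRequest("page.html")), Page.created)
```

The counter stays at zero for the ".txt" request, which is the point:
nothing is constructed until a request actually reaches the handler.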
To round this off, the special "Handlers" handler only contains the
following code. Pretty simple, but it makes construction of the
component hierarchy easier in my mind when multiple things need to be
done in turn where nesting isn't strictly required.
self.__handlers = handlers
if len(self.__handlers) != 0:
    for handler in self.__handlers:
        result = _execute(req,handler,lazy=True)
        if result is not None:
Would be very interested to see how people see this relating to what is
possible with WSGI. Could one instigate a similar sort of class to
"Handlers" in WSGI to sequence through WSGI applications until one
generates a complete response?
The things that have me thinking the answer is "no" are, first, that I
recollect it being said that the "start_response" object can only be
called once, which precludes applications in a list adding to the
response headers without returning a status. Secondly, if the
"start_response" object hasn't been called when the server starts to try
and construct the response content from the result of calling the
application, it raises an error. But then, I have a distinct lack of
knowledge on WSGI so could be wrong.
If my thinking is correct, it could only be done by changing the WSGI
specification to support the concept of trying applications in sequence,
by way of allowing None as the status when "start_response" is called,
to indicate the same as when I return None from a handler. Ie., the
application may have set headers, but the parent should where possible
move to a subsequent application and try that instead.
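
For comparison, here is a rough sketch of what sequencing might look
like within the existing interface, without any change to it. It relies
on a convention that is purely an assumption of this example: a "404"
status is treated as "pass". "FirstMatch" and the two small
applications are hypothetical names. The sketch buffers each response
body in memory and does not support the "write" callable or "exc_info".

```python
# Hypothetical sketch: try WSGI applications in order, treating a "404"
# status as "pass". The 404 convention is an assumption of this example.

class FirstMatch:
    def __init__(self, *apps):
        self.apps = apps

    def __call__(self, environ, start_response):
        # Fallback used when every application passes.
        status = "404 Not Found"
        headers = [("Content-Type", "text/plain")]
        body = [b"Not Found"]
        for app in self.apps:
            captured = []

            def capture(app_status, app_headers, exc_info=None):
                captured[:] = [app_status, list(app_headers)]
                # write() is not supported in this sketch.
                return lambda data: None

            result = app(environ, capture)
            try:
                chunks = list(result)   # buffer the whole body
            finally:
                if hasattr(result, "close"):
                    result.close()
            if captured and not captured[0].startswith("404"):
                status, headers, body = captured[0], captured[1], chunks
                break
        start_response(status, headers)
        return body


def declines(environ, start_response):
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"nothing here"]


def hello(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def call(app):
    # Tiny driver standing in for a server; records only the status.
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"] = status

    return seen, b"".join(app({}, start_response))


seen, body = call(FirstMatch(declines, hello))
print(seen["status"], body)
```

Note that any headers set by an application that passes are thrown
away here, so this does not give the "add headers, then pass" behaviour
described for the mod_python handlers.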
Anyway, people may feel that this is totally contrary to what WSGI is
all about and not relevant, and that is fine. I am finding it an
interesting thing to play with in respect of mod_python at least.
BTW, WSGI itself could just become a pluggable component within this
middleware equivalent. :-)
handler = Handlers(
Feedback most welcome. I have been trying to work out how what I am
doing could be transferred to WSGI for a little while, but if people
think it is not worthwhile then I'll no longer waste my time thinking
about it and just stick with mod_python.