[Web-SIG] Standardized configuration
Ian Bicking
ianb at colorstudy.com
Sun Jul 17 10:16:14 CEST 2005
Chris McDonough wrote:
>>Because middleware can't be introspected (generally), this makes things
>>like configuration schemas very hard to implement. It all needs to be
>>late-bound.
>
>
> The pipeline itself isn't really late bound. For instance, if I was to
> create a WSGI middleware pipeline something like this:
>
> server <--> session <--> identification <--> authentication <-->
> challenge <--> application
>
> ... session, identification, authentication, and challenge are
> middleware components (you'll need to imagine their implementations).
> And within a module that started a server, you might end up doing
> something like:
>
> def configure_pipeline(app):
>     return SessionMiddleware(
>         IdentificationMiddleware(
>             AuthenticationMiddleware(
>                 ChallengeMiddleware(app))))
>
> if __name__ == '__main__':
>     app = Application()
>     pipeline = configure_pipeline(app)
>     server = Server(pipeline)
>     server.serve()
This is what Paste does in configuration, like:
middleware.extend([
    SessionMiddleware, IdentificationMiddleware,
    AuthenticationMiddleware, ChallengeMiddleware])
This kind of middleware takes a single argument, which is the
application it will wrap. In practice, this means all the other
parameters go into lazily-read configuration.
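As a rough sketch of that idea (this is not Paste's actual code): each
middleware is constructed with nothing but the application it wraps, and
reads its settings at request time. The names HeaderMiddleware,
build_pipeline and the 'example.config' environ key are invented for
illustration.

```python
class HeaderMiddleware:
    """Hypothetical middleware whose only constructor argument is the app."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Settings are read lazily, per request, from a made-up environ key.
        config = environ.get('example.config', {})
        label = config.get('label', 'default')

        def labelled_start_response(status, headers, exc_info=None):
            headers = list(headers) + [('X-Label', label)]
            return start_response(status, headers, exc_info)

        return self.app(environ, labelled_start_response)


def build_pipeline(app, middleware):
    """Wrap app so that the first entry in the list ends up outermost."""
    for factory in reversed(middleware):
        app = factory(app)
    return app


def hello_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello']


pipeline = build_pipeline(hello_app, [HeaderMiddleware])
```

Because nothing but the application is passed in, the same list of
classes can be reused for any application; what varies per deployment is
only the configuration found at request time.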
You can also define a "framework" (a plugin to Paste), which in addition
to finding an "app" can also add middleware; basically embodying all the
middleware that is typical for a framework.
Paste is really a deployment configuration. Well, that as well as stuff
to deploy. And two frameworks. And whatever else I feel a need or
desire to throw in there.
Note also that parts of the pipeline are very much late bound. For
instance, the way I implemented Webware (and Wareweb) each servlet is a
WSGI application. So while there's one URLParser application, the
application that actually handles the request differs per request. If
you start hanging more complete applications (that might have their own
middleware) at different URLs, then this happens more generally.
There's a newish, poorly tested feature where you can do urlmap['/path']
= 'config_file.conf' and it'll hang the application described by that
configuration file at that URL.
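A rough sketch of just the dispatching half of that idea, with
applications assigned directly rather than loaded from a configuration
file. The URLMap class here is made up for illustration and is not
Paste's urlmap.

```python
class URLMap(dict):
    """Hypothetical prefix dispatcher: maps URL prefixes to WSGI apps."""

    def __call__(self, environ, start_response):
        path = environ.get('PATH_INFO', '')
        # Try the longest prefixes first so '/blog/admin' beats '/blog'.
        for prefix in sorted(self, key=len, reverse=True):
            if path == prefix or path.startswith(prefix + '/'):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME.
                environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + prefix
                environ['PATH_INFO'] = path[len(prefix):]
                return self[prefix](environ, start_response)
        start_response('404 Not Found', [('Content-Type', 'text/plain')])
        return [b'Not Found']


def show_paths(environ, start_response):
    """Sample application: reports how the URL was split."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    text = '%s|%s' % (environ['SCRIPT_NAME'], environ['PATH_INFO'])
    return [text.encode('utf-8')]


urlmap = URLMap()
urlmap['/blog'] = show_paths
```

The application that handles a request is only chosen once the request
arrives, which is the sense in which this part of the pipeline is late
bound.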
> The pipeline is static. When a request comes in, the pipeline itself is
> already constructed. I don't really want a way to prevent "improper"
> pipeline construction at startup time (right now anyway), because
> failures due to missing dependencies will be fairly obvious.
I think that's reasonable too; it's what Paste implements now.
> But some elements of the pipeline at this level of factoring do need to
> have dependencies on availability and pipeline placement of the other
> elements. In this example, proper operation of the authentication
> component depends on the availability and pipeline placement of the
> identification component. Likewise, the identification component may
> depend on values that need to be retrieved from the session component.
Yes; and potentially you could have several middlewares implementing the
same functionality for a single request, e.g., if you had different kinds
of authentication for part of your site/application; that might shadow
authentication further up the stack.
> I've just seen Phillip's post where he implies that this kind of
> fine-grained component factoring wasn't really the initial purpose of
> WSGI middleware. That's kind of a bummer. ;-)
Well, I don't understand the services he's proposing yet. I'm quite
happy with using middleware the way I have been, so I'm not seeing a
problem with it, and there's lots of benefits.
> Factoring middleware components in this way seems to provide clear
> demarcation points for reuse and maintenance. For example, I imagined a
> declarative security module that might be factored as a piece of
> middleware here: http://www.plope.com/Members/chrism/decsec_proposal .
Yes, I read that before; I haven't quite figured out how to digest it,
though. This is probably in part because of the resource-based
orientation of Zope, and WSGI is application-based, where applications
are rather opaque and defined only in terms of function.
> Of course, this sort of thing doesn't *need* to be middleware. But
> making it middleware feels very right to me in terms of being able to
> deglom nice features inspired by Zope and other frameworks into pieces
> that are easy to recombine as necessary. Implementations as WSGI
> middleware seems a nice way to move these kinds of features out of our
> respective applications and into more application-agnostic pieces that
> are very loosely coupled, but perhaps I'm taking it too far.
Certainly these pieces of code can apply to multiple applications and
disparate systems. The most obvious instance right now that I think of
is a WSGI WebDAV server (and someone's working on that for Google Summer
of Code), which should be implemented pretty framework-free, simply
because a good WebDAV implementation works at a low level. But
obviously you want that to work with the same authentication as other
parts of the system.
I guess this is how I come back to lazily introducing middleware. For
instance, some "application" (which might be a fairly small bit of
functionality) might require a session. If there's no session
available, then it can probably make a reasonable session itself. But
it shouldn't shadow any session available to it, if that's already
available. This is doubly true for something more authoritative like
authentication.
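A minimal sketch of the "don't shadow" rule, assuming the session is a
plain dictionary stored under a made-up environ key ('example.session').
A real session would need persistence and cookies; this one only shows
the fallback logic.

```python
class LazySessionMiddleware:
    """Hypothetical: supplies a session only when nothing upstream has."""

    SESSION_KEY = 'example.session'  # made-up key, not a standard

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # setdefault leaves an existing session alone and only fills a gap.
        environ.setdefault(self.SESSION_KEY, {})
        return self.app(environ, start_response)


def counter_app(environ, start_response):
    """Sample application that requires a session to be present."""
    session = environ[LazySessionMiddleware.SESSION_KEY]
    session['hits'] = session.get('hits', 0) + 1
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [str(session['hits']).encode('ascii')]


app = LazySessionMiddleware(counter_app)
```

If an outer middleware has already put a session in the environment, the
wrapper is a no-op; otherwise the application still gets something
reasonable to work with.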
>> I think authorization is different, and is conflated in
>>paste.login, but I don't have many use cases where it's a useful
>>distinction. I guess there's a number of ways of getting a username and
>>password; and to some degree the authenticator object works at that
>>level of abstraction. And there's a couple other ways of authenticating
>>a user as well (public keys, IP address, etc). I've generally used a
>>"user manager" object for this kind of abstraction, with subclassing for
>>different kinds of generality (e.g., the basic abstract class makes
>>username/password logins simple, but a subclass can override that and
>>authenticate based on anything in the request).
>
>
> Sure. OTOH, Zope 2 has proven that inheritance makes for a pretty awful
> general reuse pattern when things become sufficiently complicated.
True. But part of that is having a clear internal and external
interface. The external interface -- which you can implement without
using the abstract (convenience) superclass -- should be small and
explicit. I've found interfaces a useful way of adding discipline in
this way, even though I've never really used them at runtime.
But I think it's reasonable to use inheritance for convenience's sake, so
long as you don't implement more than one thing in a class.
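To make that concrete, here is a sketch under some assumptions: the
external interface is a single authenticate(environ) method returning a
username or None, and every class name and 'example.*' key below is
hypothetical. Passwords are compared in plain text only to keep the
example short.

```python
class PasswordUserManager:
    """Hypothetical convenience base class for username/password logins.

    The external interface is only authenticate(environ) -> name or None.
    Anything offering that method can be used without subclassing this.
    """

    def authenticate(self, environ):
        username = environ.get('example.username')
        password = environ.get('example.password')
        if username is not None and self.check_password(username, password):
            return username
        return None

    def check_password(self, username, password):
        # Internal hook: subclasses fill in only this one thing.
        raise NotImplementedError


class DictUserManager(PasswordUserManager):
    def __init__(self, passwords):
        self.passwords = passwords

    def check_password(self, username, password):
        return username in self.passwords and self.passwords[username] == password


class LocalhostUserManager:
    """Implements the same external interface without the base class."""

    def authenticate(self, environ):
        if environ.get('REMOTE_ADDR') == '127.0.0.1':
            return 'local-admin'
        return None
```

Callers depend only on authenticate(); whether an implementation
inherits from the convenience class is invisible to them.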
>>As long as it's properly partitioned, I don't think it's a terribly hard
>>problem. That is, with proper partitioning the pieces can be
>>recombined, even if the implementations aren't general enough for all
>>cases. Apache and Zope 2 authentication being examples where the
>>partitioning was done improperly.
>
>
> Yes. I think it goes further than that. For example, I'd like to be
> able to swap out implementations of the following kinds of components
> at a level somewhere above my application:
>
> Sessioning
Yes; we need a standard interface for sessions, but that's pretty
straight-forward. There's other levels where a useful standard can be
implemented as well; for instance, flup.middleware.session has
SessionStore, which is where most of the parts of the session that you'd
want to reimplement are implemented.
> Authentication/identification
This seems very doable right now, just by using REMOTE_USER. This leads
to rather dumb users -- just a string -- but it's a good
lowest-common-denominator starting point. More interesting interfaces
-- like lists of roles/groups, or user objects -- can be added on
incrementally.
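A small sketch of what that could look like, assuming the user name
travels in the CGI-style REMOTE_USER variable. The token header and the
token table are invented for illustration; a real identification
component would use cookies, HTTP auth, or similar.

```python
def token_identification(app, tokens):
    """Hypothetical middleware: maps a request token to a user name."""

    def middleware(environ, start_response):
        token = environ.get('HTTP_X_EXAMPLE_TOKEN')  # made-up header
        # Do not shadow a user the server or outer middleware already set.
        if 'REMOTE_USER' not in environ and token in tokens:
            environ['REMOTE_USER'] = tokens[token]
        return app(environ, start_response)

    return middleware


def whoami_app(environ, start_response):
    """Sample application: the user is just a string, or absent."""
    user = environ.get('REMOTE_USER', 'anonymous')
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [user.encode('utf-8')]


app = token_identification(whoami_app, {'t0ken': 'ianb'})
```

The application never learns how the user was identified, which is what
makes the plain string a workable common denominator.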
> Authorization (via something like declarative security based on a path)
Sure; I can imagine a whole slew of ways to do authorization. An
application can do it simply by returning 403 Forbidden. A front-end
middleware could do it with simple pattern matching on the URL. A URL
parser (aka traversal) can look for security annotations.
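The front-end variant might be sketched like this, assuming the user
name is available in REMOTE_USER and that rules are (regex, allowed
users) pairs. PathAuthorization and the rule format are hypothetical.

```python
import re


class PathAuthorization:
    """Hypothetical front-end middleware: URL patterns guard the app."""

    def __init__(self, app, rules):
        # rules: list of (regex pattern, iterable of allowed user names)
        self.app = app
        self.rules = [(re.compile(p), frozenset(users)) for p, users in rules]

    def __call__(self, environ, start_response):
        path = environ.get('SCRIPT_NAME', '') + environ.get('PATH_INFO', '')
        user = environ.get('REMOTE_USER')
        for pattern, allowed in self.rules:
            if pattern.match(path) and user not in allowed:
                start_response('403 Forbidden',
                               [('Content-Type', 'text/plain')])
                return [b'Forbidden']
        return self.app(environ, start_response)


def ok_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'ok']


guarded = PathAuthorization(ok_app, [(r'/admin(/|$)', {'ianb'})])
```

The wrapped application stays opaque here; the middleware only looks at
the URL and the user string.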
> Virtual hosting awareness
I've never had a problem with this, except in Zope...
Anyway, to me this feels like a kind of URL parsing. One of the
mini-proposals I made before involved a way for URL parsers to add URL
variables to the system (basically a standard WSGI key to put URL
variables as a dictionary). So a pattern like:
(?P<username>.*)\.myblogspace.com/(?P<year>\d\d\d\d)/(?P<month>\d\d)/
Would add username, year, and month variables to the system. But regex
matching is just one way; the *result* of parsing is usually either in
the object (e.g., you use domains to get entirely different sites), or
in terms of these variables.
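A sketch of the regex flavour, assuming the pattern is matched against
the host name followed by the path, and that the variables land in a
dictionary under a made-up environ key ('example.url_vars'). The pattern
is a slightly stricter version of the one above, using Python's
(?P<name>...) syntax for named groups.

```python
import re

URL_VARS_KEY = 'example.url_vars'  # hypothetical key, not part of WSGI

PATTERN = re.compile(
    r'(?P<username>[^.]+)\.myblogspace\.com'
    r'/(?P<year>\d{4})/(?P<month>\d{2})/')


def url_vars_middleware(app, pattern=PATTERN):
    """Hypothetical URL parser that publishes named groups as URL vars."""

    def middleware(environ, start_response):
        # HTTP_HOST may carry a port in practice; ignored in this sketch.
        target = environ.get('HTTP_HOST', '') + environ.get('PATH_INFO', '')
        match = pattern.match(target)
        if match:
            # Merge with variables an earlier parser may have added.
            environ.setdefault(URL_VARS_KEY, {}).update(match.groupdict())
        return app(environ, start_response)

    return middleware


seen = {}


def probe_app(environ, start_response):
    """Sample application that records the URL variables it was given."""
    seen.update(environ.get(URL_VARS_KEY, {}))
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'']


app = url_vars_middleware(probe_app)
```

A traversal-style parser could fill the same dictionary without any
regular expressions; the application only cares about the result.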
> View lookup
> View invocation
This I imagine happening either below WSGI entirely, or as part of a URL
parser. There's certainly a place for adaptation at different stages.
For instance, paste.urlparser.URLParser.get_application() clearly is
ripe for adaptation. I imagine this wrapping the "resource" with
something that renders it using a view. That works if you make resources
and views, but lots of (most?) frameworks use controllers and views, and
view lookup tends to be controller driven. So it feels very
framework-specific to me.
> Transformation during rendering
If you mean what I think -- e.g., rendering XSL -- I think WSGI is ripe
for this sort of thing. So far I've just done small things, like HTML
checking, debugging log messages, etc. But other things are very possible.
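For a sense of scale, a response-transforming middleware can be sketched
in a few lines. This version buffers the whole body, which is fine for
small pages but not for streaming; transform_middleware and add_marker
are names made up for the example, and exc_info handling is left out.

```python
def transform_middleware(app, transform):
    """Hypothetical filter: passes the complete body through transform."""

    def middleware(environ, start_response):
        captured = {}
        chunks = []

        def capture(status, headers, exc_info=None):
            captured['status'] = status
            captured['headers'] = list(headers)
            return chunks.append  # data sent via write() is buffered too

        result = app(environ, capture)
        try:
            chunks.extend(result)
        finally:
            if hasattr(result, 'close'):
                result.close()

        body = transform(b''.join(chunks))
        # The length may have changed, so replace any Content-Length.
        headers = [(name, value) for name, value in captured['headers']
                   if name.lower() != 'content-length']
        headers.append(('Content-Length', str(len(body))))
        start_response(captured['status'], headers)
        return [body]

    return middleware


def html_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/html'),
                              ('Content-Length', '9')])
    return [b'<p>hi</p>']


def add_marker(body):
    """Stand-in for a real transformation such as XSL or HTML checking."""
    return body + b'<!-- checked -->'


app = transform_middleware(html_app, add_marker)
```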
> Caching
Again, I think this is a very natural fit. Well, at least for
whole-page caching. Partial page caching doesn't really fit well at
all, I'm afraid, though both systems could use the same caching backend.
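A whole-page cache might look roughly like the following, assuming an
in-memory dictionary as the backend, GET requests only, and no expiry or
attention to cache-control headers. PageCache is a hypothetical name,
and a shared backend would replace the dictionary.

```python
class PageCache:
    """Hypothetical whole-page cache keyed on path and query string."""

    def __init__(self, app):
        self.app = app
        self.store = {}

    def __call__(self, environ, start_response):
        if environ.get('REQUEST_METHOD') != 'GET':
            return self.app(environ, start_response)
        key = (environ.get('SCRIPT_NAME', '') + environ.get('PATH_INFO', ''),
               environ.get('QUERY_STRING', ''))
        if key not in self.store:
            captured = {}
            chunks = []

            def capture(status, headers, exc_info=None):
                captured['status'] = status
                captured['headers'] = list(headers)
                return chunks.append

            result = self.app(environ, capture)
            try:
                chunks.extend(result)
            finally:
                if hasattr(result, 'close'):
                    result.close()
            status, headers = captured['status'], captured['headers']
            body = b''.join(chunks)
            if not status.startswith('200'):
                # Only successful pages are cached; pass the rest through.
                start_response(status, headers)
                return [body]
            self.store[key] = (status, headers, body)
        status, headers, body = self.store[key]
        start_response(status, list(headers))
        return [body]


calls = []


def slow_page(environ, start_response):
    """Sample application that records each time it is really invoked."""
    calls.append(environ['PATH_INFO'])
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'page body']


cached = PageCache(slow_page)
```

Since the middleware only ever sees complete responses, it has no handle
on fragments of a page, which is why partial caching has to live inside
the application.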
> Essentially, as Phillip divined, I've been trying to construct
> a framework-neutral component system out of middleware pieces to do so,
> but maybe I need to step back from that a bit. It sure is tempting,
> though. ;-)
I've found it satisfyingly easy. Maybe there's a "better" way... but
"better" without "easier" doesn't excite me at all. And we learn best
by doing... which is my way of saying you should try it with code right
now ;)
--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org