[Web-SIG] Possible specs

Sun Nov 12 03:05:12 CET 2006

Sorry, I've been out planting scuppernongs all day.

On 11/10/06, Ian Bicking <ianb at colorstudy.com> wrote:
> Luke Arno wrote:
> >> >>      * Debugging mode is something that can be used in all sorts of
> >> >> places; to increase verbosity, annotate output pages, displaying errors
> >> >> in the browser, etc. Having a single key for turning on debugging mode
> >> >> would allow its consumption in lots of places. Not as strict as
> >> >> authenticating.
> >> >
> >> > Maybe.
> >> >
> >> > wsgiorg.log_level? If you change the log level,
> >> > do you want this to affect the whole stack or
> >> > just your own stuff?
> >>
> >> I often do a bunch of stuff in debugging mode that is more aggressive
> >> than just log_level, and shouldn't be done at all in normal mode.  So a
> >> real on-off switch for debugging is useful.
> >
> > I expect your development (like mine) is a little atypical.
> > I am often working on various parts of the stack at the
> > same time. Most web developers are just working on
> > an app and would not like to see debugging info start
> > pouring out of everything under it. That is probably not
> > what you are suggesting. Still, I think that this is a low
> > value target for community standardization (though for
> > an individual who is all over the stack, it is a great
> > personal convention.)
>
> I find it fairly widely useful.  Right now I do
> environ['paste.config'].get('debug') to figure out if I'm in debug mode,
> and I find myself doing this at all sorts of levels, including directly
> in application code.
>
> Though one potential problem is if everyone starts listening for the
> same key, then the debugging information could become overwhelming.  In
> the context of the Paste Deploy entry point, I might generally do it like:
>
> class NoDefault: pass # marker
>
> def make_some_middleware(global_conf, debug=NoDefault, ...):
>      if debug is NoDefault:
>          debug = global_conf.get('debug')
>      debug = paste.deploy.converters.asbool(debug)
>      ...
>
> And maybe this is good enough.

I don't think someone just working on an app wants
that overhead. I bet most people would just want to
set log levels in a config file. The scoping of those
log levels _is_ case specific so perhaps we should
just say that reusable components should take
runtime configuration with respect to logging.
Framework and application authors can develop
their own schemes.

mw = MyMiddleWare(app, log_level=WHATEVER)

Anything beyond that requires a lot of assumptions
about how other people develop.
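
A minimal sketch of that kind of runtime configuration, assuming a
hypothetical MyMiddleWare class, a made-up logger name and a toy
hello_app (none of these come from an existing library):

```python
import logging


class MyMiddleWare:
    """Hypothetical middleware whose log level is set at construction."""

    def __init__(self, app, log_level=logging.WARNING):
        self.app = app
        # A logger scoped to this component only; the level chosen here
        # does not touch any other logger in the stack.
        self.logger = logging.getLogger("mymiddleware")
        self.logger.setLevel(log_level)

    def __call__(self, environ, start_response):
        self.logger.debug("request for %s", environ.get("PATH_INFO", ""))
        return self.app(environ, start_response)


def hello_app(environ, start_response):
    # Toy application used only to have something to wrap.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


mw = MyMiddleWare(hello_app, log_level=logging.DEBUG)
```

Whoever assembles the stack decides the level per component, so no
shared environ key is involved.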

> >> >>      * Some systems prefer that unexpected exceptions bubble up, like
> >> >> test frameworks. A key could define this case (modelled on
> >> >> paste.throw_errors) and thus disable exception catchers.
> >> >
> >> > -1 I don't need it. Too abstract.
> >> >
> >> > Usually, you just have one error handling
> >> > middleware on the outside of the rest, no?
> >>
> >> Sometimes there's multiple pieces of middleware.  For instance, I might
> >> have one on the outside, with another one in a subapplication (which is
> >> independently deployable, so needs some kind of wrapper).  Also it is
> >> needed to avoid catching errors when running tests, where you want the
> >> exception to go up all the way to your test runner (which might have a
> >> debugger or special traceback formatter).
> >
> > This is probably just not relevant to my development
> > style. Or maybe I just don't get it.
>
> Well, the general pattern is:
>
> def error_catcher(app):
>      def replacement_app(environ, start_response):
>          if environ.get('paste.throw_errors'):
>              return app(environ, start_response)
>          try:
>              return app(environ, start_response)
>          except Exception:
>              # handle and report on error
>              ...
>      return replacement_app
>
> In a testing environment I always want errors to bubble up, I don't want
> a 500 response if it can be helped; so in paste.fixture.TestApp I always
> set this.
>
> It's not a big deal, but it's fairly simple to explain and handle.  Low
> hanging fruit.

I just put such things in a usually-outermost wrapping
context and test without it. I don't catch unexpected
exceptions anywhere else in the stack, so I guess
that is why this doesn't make sense to me.

innermost_app = App()
test_innermost(innermost_app) # expect exception
inner_app = SomeMiddleWare(innermost_app)
test_inner(inner_app) # expect different exception
outer_app = ErrorMiddleWare(inner_app)
test_outer(outer_app) # expect 500

Low hanging, perhaps, but not tasty to me :) I think
we are just doing things a little differently here. If I
am the only deviant then maybe your way should be
a standard?
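
For comparison, the quoted error-catcher pattern could be filled in
roughly like this. It is only a sketch: the 500 body, the use of
wsgi.errors for the report and the broken_app example are assumptions,
and only the paste.throw_errors key comes from the discussion above.

```python
import sys
import traceback


def error_catcher(app):
    def replacement_app(environ, start_response):
        if environ.get("paste.throw_errors"):
            # Something further out (a test runner, say) wants the
            # exception itself, so do not catch anything here.
            return app(environ, start_response)
        try:
            return app(environ, start_response)
        except Exception:
            # Only errors raised while calling the app are caught;
            # errors raised later, while iterating the body, are not.
            traceback.print_exc(file=environ.get("wsgi.errors", sys.stderr))
            start_response(
                "500 Internal Server Error",
                [("Content-Type", "text/plain")],
                sys.exc_info(),
            )
            return [b"Internal Server Error\n"]
    return replacement_app


def broken_app(environ, start_response):
    # Hypothetical application that always fails.
    raise ValueError("boom")
```

With the key set, broken_app's ValueError reaches the caller; without
it, the caller gets a 500 response.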

> >> >>      * Logging is a tricky situation. The logging module allows for
> >> >> statically setting up logging systems, then configuring them at startup.
> >> >> This often isn't the best way to set up logging. Putting a
> >> >> logging.Logger instance right in the environment might be better. This
> >> >> requires some design and usage before settling on one spec.
> >> >
> >> > Maybe a lazy logger loader that takes the level
> >> > as an argument? Seems a little silly.
> >>
> >> I'm not really sure here, myself.  I know people ask for it, and
> >> sometimes I'm pretty sure I want something like this, but I don't feel
> >> very solid about what best practice in logging is.  I couldn't write
> >> this one.
> >
> > I am with you. I create loggers for their respective
> > scopes and call them as needed. Log levels go in
> > a config and that is the whole story. What else?
> >
> > I think people sometimes ask for structure because
> > they are unsure when all they need is to go for it.
>
> Well, often the container knows more about how logging should work than
> the thing itself knows.  Probably most of the time.  Passing around a
> logger helps with this.

Then I favor runtime configuration of the middleware
or application (or library or whatever):

mw = MiddleWare(app, logger=my_logger)

Framework and application authors can decide what
bits should share what loggers under what conditions.
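
A sketch of a component that accepts its logger from the container,
assuming a hypothetical MiddleWare class, a made-up logger name and a
toy app:

```python
import logging


class MiddleWare:
    """Hypothetical middleware that logs through whatever it is given."""

    def __init__(self, app, logger=None):
        self.app = app
        if logger is None:
            # Fall back to a module-level logger when the container
            # does not supply one.
            logger = logging.getLogger(__name__)
        self.logger = logger

    def __call__(self, environ, start_response):
        self.logger.info("handling %s", environ.get("PATH_INFO", ""))
        return self.app(environ, start_response)


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]


# The container owns the logger and decides who shares it.
my_logger = logging.getLogger("myapp.stack")
mw = MiddleWare(app, logger=my_logger)
```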

> >> >>      * Thread-local values are a common technique in web frameworks,
> >> >> allowing global objects or functions to return request-specific
> >> >> information. This pattern could be codified into one core system, using
> >> >> some feedback from existing systems (which have their advantages and
> >> >> flaws).
> >> >
> >> > -1 but I don't like using thread locals for such things.
> >> >    Save the magic for things that need it. :)
> >>
> >> Thread local stuff can be a pain; I often curse it.  I think it can be
> >> okay with the right patterns.
> >
> > If we codify it, I think it will be used. A lot.
> >
> > I recently looked over a sort of a contact manager app
> > for somebody. There were metaclasses all over the
> > place. :)
>
> Personally I've come to believe that threadlocals should always be
> retrieved via a function call.  Threadlocal proxy objects just cause too
> much confusion.  But it's awfully nice to have access to them.  And for
> things like configuration I find it almost essential, unless you do
> process-wide configuration, which I abhor far more than threadlocal
> variables.

I don't find myself needing this but I believe you.
Is a wsgi.org spec the right place to standardize
this broader scoped pattern?
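
To make the "retrieved via a function call" idea concrete, here is a
rough sketch built on threading.local. The names get_environ and
register_environ are invented for illustration and are not taken from
Paste or any other package.

```python
import threading

_local = threading.local()


def get_environ():
    """Return the environ of the request active in this thread."""
    try:
        return _local.environ
    except AttributeError:
        raise RuntimeError("no request is active in this thread") from None


def register_environ(app):
    def replacement(environ, start_response):
        _local.environ = environ
        try:
            return app(environ, start_response)
        finally:
            # Cleared once the app callable returns, so the value is not
            # available while a lazily generated body is being iterated.
            del _local.environ
    return replacement


def where_am_i(environ, start_response):
    # Code deep in the application can ask for the environ explicitly
    # instead of going through a proxy object.
    path = get_environ()["PATH_INFO"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [path.encode("utf-8")]


wrapped = register_environ(where_am_i)
```

The explicit call makes it obvious at the call site that the value is
request-specific, and a call outside any request fails loudly.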

> >> The interface for apps is basically:
> >>
> >>    def app_factory(global_conf, **app_conf):
> >>        return wsgi app
> >>
> >> It's pretty neutral, with some notable details:
> >>
> >> - global_conf is basically inherited configuration.  It may or may not
> >> apply to this application.  The application factory can ignore it or not
> >> as it sees fit.  app_conf is definitely intended for this application;
> >> you can do more strict checking on it, make sure there's no extra values
> >> or that certain required values are present, etc.
> >>
> >> - all values can be strings, and strings should be handled specially.
> >> That is, if you expect an integer and you get a string, you should
> >> coerce it to an integer.  If you expect a list and you get a string, you
> >> should probably split the string in some fashion, or just turn it into a
> >> one-item list.  If you expect a boolean and you get a string, you should
> >> convert it intelligently (not based on the truth/falsiness of the string
> >> itself).
> >>
> >>
> >> That's it.  Middleware and servers have similar interfaces.  The server
> >> interface is a little under-powered (it just serves a WSGI application
> >> forever); it could be extended some, or left unspecified for now.
> >> Middleware takes as a first argument the application being wrapped.  Oh,
> >> and composites, which are a little harder -- they take as a first
> >> argument an object that can load more WSGI apps.  That's used for
> >> dispatchers that can direct to multiple subapplications.  That's more
> >> tightly coupled with Paste Deploy's app naming conventions, and it might
> >> be better to put explicit app loading into the configuration format and
> >> pass them in as keyword arguments to the dispatching app.
> >
> > This is all a bit meta for me. I guess I don't get it. :)
>
> It gives you a consistent way to configure WSGI stacks, from a config
> file, database, or whatever.

That much I do get. It seems clever but I have never
needed to do it. Outside of Paste and Paste users I always
see folks just building stacks in good old Python.
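
For readers who have not seen it, the quoted factory interface and its
string coercion rules might look something like the sketch below. The
asbool here is a local stand-in written for this example (Paste Deploy
ships its own converter), and the "repeat" setting is invented.

```python
def asbool(value):
    """Convert a config string to a boolean without using its truthiness."""
    if isinstance(value, str):
        value = value.strip().lower()
        if value in ("true", "yes", "on", "y", "t", "1"):
            return True
        if value in ("false", "no", "off", "n", "f", "0", ""):
            return False
        raise ValueError("not a boolean string: %r" % value)
    return bool(value)


def app_factory(global_conf, **app_conf):
    conf = dict(app_conf)
    # global_conf is inherited and only consulted as a fallback.
    debug = asbool(conf.pop("debug", global_conf.get("debug", False)))
    # Values may arrive as strings, so coerce to the expected type.
    repeat = int(conf.pop("repeat", 1))
    # app_conf is meant for this app, so extra keys are treated as errors.
    if conf:
        raise ValueError("unexpected settings: %s" % ", ".join(sorted(conf)))

    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        body = b"hello\n" * repeat
        if debug:
            body += b"(debug on)\n"
        return [body]

    return app
```

The same factory can then be called from a config loader or directly
from plain Python, e.g. app_factory({}, repeat=2).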

> >> >>      * A way to extend wsgiref.validate to add more validation, for all
> >> >> these new specs. (Probably this is an implementation, not a spec)
> >> >
> >> > That makes sense if our standards are that involved.
> >> > I don't see standardizable clarity on much that is so
> >> > complex.
> >>
> >> WSGI isn't all that complex, but validation is very helpful when people
> >> get it wrong.  It should be useful for any of these specs as well.  It
> >> also makes the spec much more explicitly clear, because computers check
> >> things more reliably than humans ;)
> >
> > I am not arguing against validation. I am just hoping
> > that it's superfluous in the near term.
>
> Testing is never superfluous ;)

Anything can be superfluous in a pure utility realm,
such as software. Those of us who fulfill ourselves
in the act of creating software can objectify that
satisfaction into its vehicles of utility and come to
mistake them as intrinsically valuable.

Pardon the philosophy. ;)

> >> >>      * Anchors for doing recursive calls, similar to paste.recursive.
> >> >> (paste.recursive is kind of an old module that is more complicated than
> >> >> it needs to be)
> >> >
> >> > Is that really such a common pattern? It is clever
> >> > but I have yet to find a case for it. Maybe I am just
> >> > overlooking something. What do you usually use
> >> > that for?
> >>
> >> Originally it was to support internal redirects and inclusion (which
> >> were part of the Webware API).  Now I find it useful for doing internal
> >> subrequests when using web-based APIs.
> >
> > That makes more sense. I have some forwarding
> > stuff like that in YARO because it can be used
> > to hide the WSGI interface a little. Other than that
> > I just call the thing I want to forward to. (I don't
> > like to use exceptions for normal flow control very
> > much, though it does come up.)
>
> I'm not a big fan of forwarding and using exceptions to unwrap the
> middleware.  But including content is much simpler.  I think it can be
> as simple as:
>
> def middleware(app):
>      def replacement(environ, start_response):
>          anchors = environ.setdefault(
>              'x-wsgiorg.app_anchors', {})
>          anchors[environ['SCRIPT_NAME']] = (app, environ.copy())
>          return app(environ, start_response)
>      return replacement
>
> This gives you the app and an indication of what the environ looks like
> when the app is typically reached.  From here you can implement
> recursive calls fairly easily.
>
> Whether there should be support for multiple anchors, I'm not sure.  I
> think it could be argued that the closest anchor is best to use, but the
> furthest one offers the most URI space (supposing there are multiple
> pieces of middleware like this in a stack).

Fancy. Confusing. Only works under the right
conditions, no? I think this is cool experimentation
but I just don't think it is a good candidate for
standardization.
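
To show the conditions it depends on, here is one way an include could
be built on the quoted anchor middleware. The include helper, its
"longest matching SCRIPT_NAME" rule and the site app are assumptions
made for this sketch; only the x-wsgiorg.app_anchors key and the
anchoring step come from the quoted code.

```python
def anchor(app):
    def replacement(environ, start_response):
        anchors = environ.setdefault("x-wsgiorg.app_anchors", {})
        anchors[environ.get("SCRIPT_NAME", "")] = (app, environ.copy())
        return app(environ, start_response)
    return replacement


def include(environ, path):
    """Internal GET of path; returns (status, body bytes)."""
    anchors = environ["x-wsgiorg.app_anchors"]
    # Choose the closest anchor: the longest SCRIPT_NAME that contains path.
    names = [n for n in anchors if path == n or path.startswith(n + "/")]
    if not names:
        raise LookupError("no anchor covers %r" % path)
    name = max(names, key=len)
    app, base_environ = anchors[name]
    sub_environ = base_environ.copy()
    sub_environ.update({
        "REQUEST_METHOD": "GET",
        "SCRIPT_NAME": name,
        "PATH_INFO": path[len(name):],
        "QUERY_STRING": "",
    })
    captured = {}

    def capture(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(sub_environ, capture))
    return captured["status"], body


def site(environ, start_response):
    # Toy app: /page includes the content served at /footer.
    start_response("200 OK", [("Content-Type", "text/plain")])
    if environ["PATH_INFO"] == "/page":
        status, footer = include(environ, "/footer")
        return [b"page\n" + footer]
    return [b"footer\n"]


app = anchor(site)
```

Even this small version has to make choices (which anchor wins, what
to do with the request body and headers of the outer request), which
is the sort of thing a spec would have to pin down.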

> > Either way, finding what to forward to is the trick,
> > which is dispatch, and there are a whole lot of
> > ways to skin that cat.
>
> In this case there's no guessing, if you are selecting where to forward
> by URI.  This technique respects all dispatching.
>
>
> >> >>      * A place to put a database transaction manager
> >> >
> >> > -1 Way too specific.
> >>
> >> There'd have to be a database transaction manager standard first, anyway.
> >
> > I admire your ambition. :)
>
> I think it's reasonable, but something for db-sig and not web-sig.
>
>
> >> >>      * More user information than just REMOTE_USER; like
> >> >> wsgiorg.user_info?
> >> >
> >> > User objects are like request objects and I don't see
> >> > what we gain from making them fight for a key.
> >> >
> >> > Could this actually decrease real interop?
> >>
> >> This shouldn't be a real user object, more likely a dictionary, just
> >> like the WSGI request is a dictionary (environ) and not an object.
> >> Objects are hard to standardize ;)
> >
> > They are. I would be interested in this though I
> > remain somewhat skeptical.
>
> Right now we're doing this in a project where we want to support
> embedding the app in Zope or having it be stand-alone.  We want to keep
> it fairly loosely coupled from Zope, so we're trying not to pass in a
> user object.
>
> We haven't gone that far with it, but maybe later it'll feel more clear
> to me.  This isn't low hanging fruit.

Actually, this and the sessions, boring as they may be,
are the best candidates to grow convention on. (Maybe
standards are best when boring; well understood
problems are nice and boring.) Should this be tackled
by choosing a set of new keys for environ (to go beyond
REMOTE_USER with wsgiorg.user_roles and such) or
should there be one new key with a dict in it
(like wsgiorg.user_attribs)?
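
As a sketch of the single-key option: the middleware below fills a
plain dict under wsgiorg.user_attribs. The lookup callable and the
attribute names ("roles", "email") are purely illustrative; nothing
here is an agreed spec.

```python
def user_info_middleware(app, lookup):
    """Attach a plain dict of user attributes next to REMOTE_USER.

    lookup is a hypothetical callable mapping a user name to a dict.
    """
    def replacement(environ, start_response):
        user = environ.get("REMOTE_USER")
        if user is not None:
            # A copy, so applications cannot mutate the lookup's data.
            environ["wsgiorg.user_attribs"] = dict(lookup(user))
        return app(environ, start_response)
    return replacement


def roles_app(environ, start_response):
    # Toy application that reports the roles it was given.
    attribs = environ.get("wsgiorg.user_attribs", {})
    roles = ",".join(attribs.get("roles", []))
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [roles.encode("utf-8")]


directory = {"luke": {"roles": ["admin", "editor"], "email": "luke@example.org"}}
app = user_info_middleware(roles_app, directory.__getitem__)
```

The alternative, one environ key per attribute, would put the same
data in several wsgiorg.* keys instead of one dict.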

There are many existing specs for user attributes that
we could examine. The first step should probably be
for someone to document those. Given the proliferation
of such standards, I am still skeptical...

... but I am always skeptical ;)

Cheers,
- Luke