[Web-SIG] Latest WSGI Draft

Mon Aug 23 00:41:31 CEST 2004

Phillip J. Eby wrote:
> At 02:18 PM 8/22/04 -0500, Ian Bicking wrote:
> 
>> Phillip J. Eby wrote:
>>
>>>
>>> Do you have a specific suggestion here?
>>
>>
>> Use only the term "server".
> 
> 
> I'm rather reluctant to do that, because CGI, FastCGI, and many other 
> such systems are "gateways" rather than servers per se.  Technically, I 
> would only consider a web server that's written in Python, or embeds 
> Python, to be capable of being a "server" per the spec.  Other servers 
> must be accessed via a "gateway" written in Python.  Certainly it 
> doesn't make sense to talk about a CGI "server", for example.

Okay, that's fine then.

>>>>   Really, it's for CGI and nothing else.  Maybe just wsgi.cgi?
>>>> wsgi.run_once?  I think the semantics shouldn't be any more general 
>>>> than that.  Then we can also guarantee that it won't be called again.
>>>
>>>
>>> I'm really reluctant to require the server to make such a guarantee.  
>>> My understanding of your use case is really more like, "I'm not 
>>> likely to run you again for a while, so don't optimize for frequent 
>>> execution."
>>> Hm.  Now that I'm thinking about it more, it seems to me that this 
>>> could be just as easily handled by application/framework-side 
>>> configuration, and I'm inclined to remove it from the spec altogether.
>>
>>
>> That was initially how multithreaded and multiprocess were going to be 
>> handled too, but I think it's really important that those be 
>> specified.  CGI is the only realistic use case for this feature, but 
>> it's a really common use case (since it's really just a widely 
>> supported standard that we are building on), and it presents a 
>> distinct set of problems for Python.  I don't see any reason not to 
>> just be explicit about being in a CGI environment -- every server will 
>> clearly know if it's in a CGI environment, every application can 
>> ignore it if it chooses, everyone will know exactly what it means in 
>> the spec.
> 
> 
> Alright.  Let's make it 'wsgi.run_once'.  Here's my attempt at a shorter 
> explanation:
> 
> ``wsgi.run_once``      This value should be true if the server/gateway
>                        expects (but does not guarantee!) that the
>                        application will only be invoked this one time
>                        during the life of its containing process.
>                        Normally, this will only be true for a gateway
>                        based on CGI (or something similar).

Is there a reason it can't be guaranteed?

>>> You're right.  The extension mechanism needs to be clearer.  Instead 
>>> of throwing away everything, there needs to be a way to identify that 
>>> a server-supplied value may be used in place of some WSGI 
>>> functionality, so that middleware can remove only those items, rather 
>>> than every item.
>>> Hmmm.  Maybe we should have a 'wsgi.extensions' key that contains a 
>>> dictionary for items that middleware *must* either understand, or not 
>>> pass through.  If a framework or middleware author did your 
>>> hypothetical query string parsing, he would have to place it in 
>>> 'wsgi.extensions' if he did not implement the cross-check you describe.
>>
>>
>> I'm quite comfortable with solving this on an ad hoc basis.  Generally 
>> the issue is middleware that rewrites the environment, but some 
>> extension depends on a value in the environment and isn't 
>> simultaneously updated.  In general, keeping a note about what the 
>> value of the key was will work fine, in the small number of cases 
>> where it is an issue. Then it's up to the extension-using application 
>> (and middleware) to agree on a reliable way to do things, and other 
>> pieces of middleware don't need to worry about any of it.
>>
>> I guess the problem is that someone might build in a dependency, but 
>> not be careful about it, and bugs would only arise in the presence of 
>> some middleware which the author didn't test with.  It's the same 
>> issue if the author doesn't set wsgi.extensions properly, though 
>> that's more explicit and maybe harder to miss.
> 
> 
> Here's the use case I'm thinking of.  Suppose mod_python wants to expose 
> some nifty super-duper API that an application can use in place of pure 
> WSGI, if it's present.  But this interface may bypass certain 
> features that a particular piece of middleware is intended to 
> intercept.  So, my idea here is that if mod_python puts that API into a 
> key in 'wsgi.extensions', then any middleware will know it's safely 
> "intercepting communications" if it discards any 'wsgi.extensions'.
> 
> This is different from the sort of scenario you're talking about, where 
> you can have cached data include a record of its dependencies to ensure 
> correctness.
> 
> So here's the idea:
> 
>  * If you provide an alternative mechanism or extension to a 
> WSGI-supplied facility, you place it in the 'wsgi.extensions' dictionary
> 
>  * If you're middleware that simply adds additional data to the 
> 'environ', do so, recording your dependencies if any, to avoid becoming 
> "stale" if other middleware changes things
> 
>  * If you're middleware that makes changes to existing variables, or 
> intercepts any WSGI operations, do 'environ["wsgi.extensions"].clear()' 
> or delete any extensions you can't intercept, to prevent the underlying 
> application from "going around" you.
> 
> Your thoughts?

Okay, that seems reasonable.  For instance, I could imagine mod_python 
putting its Apache request object in an extension.  Something like an 
exception-catching middleware wouldn't really care about this sort of 
thing, so it wouldn't clear the extensions, but a middleware that 
filtered the output wouldn't want that extension around.

I guess a general rule would be that any extension that provided a route 
around input/output should be in wsgi.extensions, and any middleware 
that relies on input and output should clear those extensions.  Should 
that rule also apply to the other environmental variables?
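
Here is a minimal sketch of the two kinds of middleware I have in mind, 
assuming the 'wsgi.extensions' dictionary proposed above.  The function 
names and the 'hypothetical.request' key are invented for the example; 
nothing here is settled API:

```python
# Sketch only: 'wsgi.extensions' is the key proposed in this thread;
# every other name below is hypothetical.

def uppercase_output(app):
    """Output-filtering middleware: must not be bypassed."""
    def wrapper(environ, start_response):
        extensions = environ.get("wsgi.extensions")
        if extensions is not None:
            # Discard anything that could route output around us.
            extensions.clear()
        return [chunk.upper() for chunk in app(environ, start_response)]
    return wrapper


def catch_errors(app):
    """Exception-catching middleware: leaves the extensions alone."""
    def wrapper(environ, start_response):
        try:
            return app(environ, start_response)
        except Exception:
            start_response("500 Internal Server Error",
                           [("Content-Type", "text/plain")])
            return [b"Internal Server Error"]
    return wrapper


def demo_app(environ, start_response):
    """Uses a server extension when one is visible, else plain WSGI."""
    extensions = environ.get("wsgi.extensions", {})
    start_response("200 OK", [("Content-Type", "text/plain")])
    if "hypothetical.request" in extensions:
        return [b"bypassed"]
    return [b"plain wsgi"]
```

Wrapped in `catch_errors`, `demo_app` still sees the extension; wrapped in 
`uppercase_output`, it falls back to plain WSGI and its output gets filtered.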

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org