[Web-SIG] Latest WSGI Draft
Ian Bicking
ianb at colorstudy.com
Mon Aug 23 00:41:31 CEST 2004
Phillip J. Eby wrote:
> At 02:18 PM 8/22/04 -0500, Ian Bicking wrote:
>
>> Phillip J. Eby wrote:
>>
>>>
>>> Do you have a specific suggestion here?
>>
>>
>> Use only the term "server".
>
>
> I'm rather reluctant to do that, because CGI, FastCGI, and many other
> such systems are "gateways" rather than servers per se. Technically, I
> would only consider a web server that's written in Python, or embeds
> Python, to be capable of being a "server" per the spec. Other servers
> must be accessed via a "gateway" written in Python. Certainly it
> doesn't make sense to talk about a CGI "server", for example.

Okay, that's fine then.

>>>> Really, it's for CGI and nothing else. Maybe just wsgi.cgi?
>>>> wsgi.run_once? I think the semantics shouldn't be any more general
>>>> than that. Then we can also guarantee that it won't be called again.
>>>
>>>
>>> I'm really reluctant to require the server to make such a guarantee.
>>> My understanding of your use case is really more like, "I'm not
>>> likely to run you again for a while, so don't optimize for frequent
>>> execution."
>>> Hm. Now that I'm thinking about it more, it seems to me that this
>>> could be just as easily handled by application/framework-side
>>> configuration, and I'm inclined to remove it from the spec altogether.
>>
>>
>> That was initially how multithreaded and multiprocess were going to be
>> handled too, but I think it's really important that those will be
>> specified. CGI is the only realistic use case for this feature, but
>> it's a really common use case (since it's really just a widely
>> supported standard that we are building on), and it presents a
>> distinct set of problems for Python. I don't see any reason not to
>> just be explicit about being in a CGI environment -- every server will
>> clearly know if it's in a CGI environment, every application can
>> ignore it if it chooses, everyone will know exactly what it means in
>> the spec.
>
>
> Alright. Let's make it 'wsgi.run_once'. Here's my attempt at a shorter
> explanation:
>
> ``wsgi.run_once`` This value should be true if the server/gateway
> expects (but does not guarantee!) that the
> application will only be invoked this one time
> during the life of its containing process.
> Normally, this will only be true for a gateway
> based on CGI (or something similar).

Is there a reason it can't be guaranteed?

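To make the question concrete, here is a rough sketch of how an application might read the flag as currently worded. The names expensive_setup and _cache are made up for illustration, and the sketch treats 'wsgi.run_once' purely as a hint, so the application still behaves correctly if it does get called again:

```python
_cache = None  # only survives between requests in a long-running process


def expensive_setup():
    """Hypothetical stand-in for costly one-time work, e.g. warming a cache."""
    return {"warmed": True}


def application(environ, start_response):
    global _cache
    if environ.get("wsgi.run_once", False):
        # Probably CGI: skip work that only pays off over many requests.
        mode = "run-once"
    else:
        # Long-running server: do the setup once and keep the result.
        if _cache is None:
            _cache = expensive_setup()
        mode = "long-running"
    body = ("mode: %s\n" % mode).encode("ascii")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

If the server guaranteed a single invocation, the first branch could go further and skip any cleanup it would otherwise do for later requests.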
>>> You're right. The extension mechanism needs to be clearer. Instead
>>> of throwing away everything, there needs to be a way to identify that
>>> a server-supplied value may be used in place of some WSGI
>>> functionality, so that middleware can remove only those items, rather
>>> than every item.
>>> Hmmm. Maybe we should have a 'wsgi.extensions' key that contains a
>>> dictionary for items that middleware *must* either understand, or not
>>> pass through. If a framework or middleware author did your
>>> hypothetical query string parsing, he would have to place it in
>>> 'wsgi.extensions' if he did not implement the cross-check you describe.
>>
>>
>> I'm quite comfortable with solving this on an ad hoc basis. Generally
>> the issue is middleware that rewrites the environment, but some
>> extension depends on a value in the environment and isn't
>> simultaneously updated. In general, keeping a note about what the
>> value of the key was will work fine, in the small number of cases
>> where it is an issue. Then it's up to the extension-using application
>> (and middleware) to agree on a reliable way to do things, and other
>> pieces of middleware don't need to worry about any of it.
>>
>> I guess the problem is that someone might build in a dependency, but
>> not be careful about it, and bugs would only arise in the presence of
>> some middleware which the author didn't test with. It's the same
>> issue if the author doesn't set wsgi.extensions properly, though
>> that's more explicit and maybe harder to miss.
>
>
> Here's the use case I'm thinking of. Suppose mod_python wants to expose
> some nifty super-duper API that an application can use in place of pure
> WSGI, if it's present. But, this interface maybe bypasses certain
> features that a particular piece of middleware is intended to
> intercept. So, my idea here is that if mod_python puts that API into a
> key in 'wsgi.extensions', then any middleware will know it's safely
> "intercepting communications" if it discards any 'wsgi.extensions'.
>
> This is different from the sort of scenario you're talking about, where
> you can have cached data include a record of its dependencies to ensure
> correctness.
>
> So here's the idea:
>
> * If you provide an alternative mechanism or extension to a
> WSGI-supplied facility, you place it in the 'wsgi.extensions' dictionary
>
> * If you're middleware that simply adds additional data to the
> 'environ', do so, recording your dependencies if any, to avoid becoming
> "stale" if other middleware changes things
>
> * If you're middleware that makes changes to existing variables, or
> intercepts any WSGI operations, do 'environ["wsgi.extensions"].clear()'
> or delete any extensions you can't intercept, to prevent the underlying
> application from "going around" you.
>
> Your thoughts?
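For the second bullet (recording your dependencies), this is roughly what I had in mind, using the query string parsing example. It is only a sketch: the key 'example.parsed_query' and the function name are invented here, and the parsed result is stored together with the QUERY_STRING it came from, so a later rewrite of QUERY_STRING makes the cached value detectably stale:

```python
from urllib.parse import parse_qs


def parsed_query(environ):
    """Return the parsed query string, re-parsing if QUERY_STRING changed."""
    raw = environ.get("QUERY_STRING", "")
    cached = environ.get("example.parsed_query")  # hypothetical key
    if cached is not None and cached[0] == raw:
        # The note we kept still matches the environment, so reuse it.
        return cached[1]
    parsed = parse_qs(raw)
    # Store the value together with the input it was derived from.
    environ["example.parsed_query"] = (raw, parsed)
    return parsed
```

Middleware that rewrites QUERY_STRING doesn't need to know about this key at all; the mismatch is caught the next time the value is requested.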

Okay, that seems reasonable. For instance, I could imagine mod_python
putting its Apache request object in an extension. Something like an
exception-catching middleware wouldn't really care about this sort of
thing, so it wouldn't clear the extensions, but a middleware that
filtered the output wouldn't want that extension around.

I guess a general rule would be that any extension that provided a route
around input/output should be in wsgi.extensions, and any middleware
that relies on input and output should clear those extensions. Should
that rule also apply to the other environmental variables?
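
A sketch of the two kinds of middleware, assuming the proposed 'wsgi.extensions' key ends up as a plain dictionary in the environ. Everything named here (catch_errors, uppercase_filter, demo_app, and the 'example.raw_write' extension) is hypothetical, and the output filter is simplified: it collects the whole response eagerly and doesn't call close() on the application's iterable:

```python
def catch_errors(app):
    """Exception-catching middleware: leaves wsgi.extensions alone."""
    def middleware(environ, start_response):
        try:
            return app(environ, start_response)
        except Exception:
            start_response("500 Internal Server Error",
                           [("Content-Type", "text/plain")])
            return [b"Internal Server Error\n"]
    return middleware


def uppercase_filter(app):
    """Output-filtering middleware: clears extensions it can't intercept."""
    def middleware(environ, start_response):
        # Remove any route around our output filtering before calling down.
        environ.setdefault("wsgi.extensions", {}).clear()
        return [chunk.upper() for chunk in app(environ, start_response)]
    return middleware


def demo_app(environ, start_response):
    """Uses a (hypothetical) server extension when one is still present."""
    extensions = environ.get("wsgi.extensions", {})
    start_response("200 OK", [("Content-Type", "text/plain")])
    if "example.raw_write" in extensions:
        return [b"used extension\n"]
    return [b"used plain wsgi\n"]
```

Wrapped in catch_errors, demo_app still sees the extension; wrapped in uppercase_filter, it falls back to plain WSGI and its output goes through the filter.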
--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org