[Web-SIG] Latest WSGI Draft
Phillip J. Eby
pje at telecommunity.com
Mon Aug 23 00:14:43 CEST 2004
At 02:18 PM 8/22/04 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>Do you have a specific suggestion here?
>Use only the term "server".
I'm rather reluctant to do that, because CGI, FastCGI, and many other such
systems are "gateways" rather than servers per se. Technically, I would
only consider a web server that's written in Python, or embeds Python, to
be capable of being a "server" per the spec. Other servers must be
accessed via a "gateway" written in Python. Certainly it doesn't make
sense to talk about a CGI "server", for example.
>running start_response in __iter__ seems strange to me. Maybe it's
>correct, but I expect the call sequence to be:
> start_response(status_code, environ) returns write()
> possible write() calls
>application returns iterable
>server uses iterable
>In this example, the write() function only is created after you start the
>iteration. Maybe that's fine, I'm not sure -- it's a little odd, because
>when you start the iteration you expect to be getting the body, but the
>headers haven't been sent yet. Of course, you ensure the headers get
>sent, but it definitely confuses me.
Darn. I guess now I'll have to explain this part, too. :) The intent of
the spec is to allow start_response() to be called during the first
iteration of the iterator. That is, you must have called start_response()
at least by the time the first body part is yielded from the iterator.
I illustrated this in the example, but forgot to mention it in the
text. I'm correcting this now.
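A minimal sketch of the call order described above may help. It assumes the start_response(status, headers) signature from the later published form of the spec and uses Python 3 byte strings. The names LazyApp and run_for_demo are hypothetical, and the toy "server" only records the order of events.

```python
class LazyApp:
    """Application object that defers start_response() to its iterator."""

    def __init__(self, environ, start_response):
        # Calling the application only stores its arguments.
        self.environ = environ
        self.start_response = start_response

    def __iter__(self):
        # start_response() runs during the first iteration step,
        # before the first body part is yielded.
        self.start_response("200 OK", [("Content-Type", "text/plain")])
        yield b"Hello, world!\n"


def run_for_demo(app):
    """Hypothetical toy server: records what the application does, in order."""
    calls = []

    def start_response(status, headers):
        calls.append(("start_response", status))
        # Return a write() callable, as the spec requires.
        return lambda data: calls.append(("write", data))

    result = app({}, start_response)
    calls.append(("app returned",))
    for chunk in result:
        calls.append(("body", chunk))
    return calls


demo_calls = run_for_demo(LazyApp)
```

Here the application returns before start_response() is called, yet the status is still recorded before the first body chunk reaches the server.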
>>> Really, it's for CGI and nothing else. Maybe just wsgi.cgi?
>>>wsgi.run_once? I think the semantics shouldn't be any more general than
>>>that. Then we can also guarantee that it won't be called again.
>>I'm really reluctant to require the server to make such a guarantee. My
>>understanding of your use case is really more like, "I'm not likely to
>>run you again for a while, so don't optimize for frequent execution."
>>Hm. Now that I'm thinking about it more, it seems to me that this could
>>be just as easily handled by application/framework-side configuration,
>>and I'm inclined to remove it from the spec altogether.
>That was initially how multithreaded and multiprocess was going to be
>handled too, but I think it's really important that those will be
>specified. CGI is the only realistic use case for this feature, but it's
>a really common use case (since it's really just a widely supported
>standard that we are building on), and it presents a distinct set of
>problems for Python. I don't see any reason not to just be explicit about
>being in a CGI environment -- every server will clearly know if it's in a
>CGI environment, every application can ignore it if it chooses, everyone
>will know exactly what it means in the spec.
Alright. Let's make it 'wsgi.run_once'. Here's my attempt at a shorter
``wsgi.run_once``    This value should be true if the server/gateway
                     expects (but does not guarantee!) that the
                     application will only be invoked this one time
                     during the life of its containing process.
                     Normally, this will only be true for a gateway
                     based on CGI (or something similar).
>>You're right. The extension mechanism needs to be clearer. Instead of
>>throwing away everything, there needs to be a way to identify that a
>>server-supplied value may be used in place of some WSGI functionality, so
>>that middleware can remove only those items, rather than every item.
>>Hmmm. Maybe we should have a 'wsgi.extensions' key that contains a
>>dictionary for items that middleware *must* either understand, or not
>>pass through. If a framework or middleware author did your hypothetical
>>query string parsing, he would have to place it in 'wsgi.extensions' if
>>he did not implement the cross-check you describe.
>I'm quite comfortable with solving this on an ad hoc basis. Generally the
>issue is middleware that rewrites the environment, but some extension
>depends on a value in the environment and isn't simultaneously
>updated. In general, keeping a note about what the value of the key was
>will work fine, in those small number of cases where it is an issue. Then
>it's up to the extension-using application (and middleware) to agree on a
>reliable way to do things, and other pieces of middleware don't need to
>worry about any of it.
>I guess the problem is that someone might build in a dependency, but not
>be careful about it, and bugs would only arise in the presence of some
>middleware which the author didn't test with. It's the same issue if the
>author doesn't set wsgi.extensions properly, though that's more explicit
>and maybe harder to miss.
Here's the use case I'm thinking of. Suppose mod_python wants to expose
some nifty super-duper API that an application can use in place of pure
WSGI, if it's present. But, this interface maybe bypasses certain features
that a particular piece of middleware is intended to intercept. So, my
idea here is that if mod_python puts that API into a key in
'wsgi.extensions', then any middleware will know it's safely "intercepting
communications" if it discards any 'wsgi.extensions'.
This is different from the sort of scenario you're talking about, where you
can have cached data include a record of its dependencies to ensure
So here's the idea:
* If you provide an alternative mechanism or extension to a WSGI-supplied
facility, you place it in the 'wsgi.extensions' dictionary
* If you're middleware that simply adds additional data to the 'environ',
do so, recording your dependencies if any, to avoid becoming "stale" if
other middleware changes things
* If you're middleware that makes changes to existing variables, or
intercepts any WSGI operations, do 'environ["wsgi.extensions"].clear()' or
delete any extensions you can't intercept, to prevent the underlying
application from "going around" you.
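The second and third rules above could be sketched roughly as follows. This is only an illustration of the proposal as stated here: the 'example.parsed_query' key, the parsed_query helper, and the PrefixMiddleware class are hypothetical names, and 'wsgi.extensions' is assumed to be a plain dictionary in 'environ'.

```python
from urllib.parse import parse_qs


def parsed_query(environ):
    """Add derived data to environ, recording the value it depends on."""
    query_string = environ.get("QUERY_STRING", "")
    cached = environ.get("example.parsed_query")
    # Re-parse if nothing is cached, or if other middleware has changed
    # QUERY_STRING since the cached copy was made (the "stale" case).
    if cached is None or cached[0] != query_string:
        cached = (query_string, parse_qs(query_string))
        environ["example.parsed_query"] = cached
    return cached[1]


class PrefixMiddleware:
    """Middleware that rewrites existing variables, so it drops extensions."""

    def __init__(self, app, prefix):
        self.app = app
        self.prefix = prefix

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path.startswith(self.prefix):
            # Move the prefix from PATH_INFO to SCRIPT_NAME.
            environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + self.prefix
            environ["PATH_INFO"] = path[len(self.prefix):]
        # This middleware cannot intercept any server-specific API, so it
        # clears the extensions to stop the application going around it.
        environ.setdefault("wsgi.extensions", {}).clear()
        return self.app(environ, start_response)
```

Middleware that understands a particular extension could instead delete only the entries it cannot intercept and leave the rest in place.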