[Web-SIG] Stuff left to be done on WSGI
Phillip J. Eby
pje at telecommunity.com
Sat Aug 28 05:13:43 CEST 2004
At 07:00 PM 8/27/04 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>I don't know if it's possible for us to get these items together in time
>>for 2.4; if we don't, we don't.
>I can't imagine we would make it.
You're probably right; it's just so tantalizingly close, as AMK mentioned.
>I would hope that we can come to some consensus and produce something
>useable before 2.5, with the understanding that it will be included in
>2.5. I would kind of like to see a "web" package.
I think we'll have better luck with a 'wsgi' package, but I could be
wrong. 'web' just seems like a nuisance attractor for all sorts of
unproductive bickering on so many levels.
On a more immediate practical level, we'd be crazy to try to claim 'web'
for a third-party package that we want to propose for the stdlib, but a
package named 'wsgi' would be more than fair game.
>>There's little harm in having a separate 'wsgi' distribution until 2.5
>>rolls around. I'm thinking the package should include:
>> * BaseHTTPServer-based WSGI server
>> * CGI-based WSGI gateway (run WSGI apps under CGI)
>You've noted these are missing error handling. What kind were you
>thinking of specifically?
>There's exception handling, which seems straightforward.
Well, to be honest, I haven't a clue what one does about errors *after* the
headers are written. You can't send anything useful to the client, because
the status is already set.
If you sent a Content-Length, you can break the connection before that
point, and it's a fair guess the client will know something's wrong. If
you *didn't* send a content length and break the connection, the client
gets an incomplete file and maybe doesn't know it. Sending an error
message once 'write()' has been called will garble the output.
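To make those cases concrete, here is a minimal sketch of a gateway that picks a strategy based on how far the response has progressed. The names `run_app` and `AbortConnection`, and the `out` list standing in for the socket, are assumptions made up for this illustration; none of them come from the draft spec.

```python
class AbortConnection(Exception):
    """Hypothetical signal telling the gateway to drop the connection."""


def run_app(app, environ, out):
    """Run `app`, appending raw response chunks (bytes) to the list `out`."""
    state = {"status": None, "headers": None, "sent": False}

    def send_headers():
        lines = ["HTTP/1.0 " + state["status"]]
        lines += ["%s: %s" % header for header in state["headers"]]
        out.append(("\r\n".join(lines) + "\r\n\r\n").encode("latin-1"))
        state["sent"] = True

    def write(data):
        # Headers go out lazily, just before the first body chunk.
        if not state["sent"]:
            send_headers()
        out.append(data)

    def start_response(status, headers):
        state["status"], state["headers"] = status, headers
        return write

    try:
        for chunk in app(environ, start_response):
            write(chunk)
        if not state["sent"]:
            send_headers()  # the application produced no body at all
    except Exception:
        if not state["sent"]:
            # Nothing is on the wire yet, so a clean error page is possible.
            state["status"] = "500 Internal Server Error"
            state["headers"] = [("Content-Type", "text/plain")]
            write(b"Internal server error\n")
        else:
            # The status is already set; appending an error message would
            # only garble the output, so break the connection instead.
            raise AbortConnection()
```

An application that fails before any output gets a 500 page; one that fails after calling `write()` causes `AbortConnection`, which a real server would turn into closing the socket.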
All of these options are especially unsatisfactory when binary files are
involved, where "unsatisfactory" could mean anything from "annoying" to
"catastrophic" (e.g. garbling an executable).
> Spec compliance? Certainly an anal version of these servers should be
> written, that checks every type passed around, looks for common mistakes,
> etc. I don't know if the anal and the useable version need to be the
> same thing.
I wasn't even addressing spec compliance, although test suites for all the
implementations, factored so that they could be used as a basis for testing
other implementations, would certainly be nice.
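One way to keep the strict and the usable versions separate is to put the checking in middleware rather than in the servers. The following is only a sketch: `checking_middleware` and `SpecViolation` are hypothetical names, and the particular checks are guesses at common mistakes, not a list taken from the spec.

```python
class SpecViolation(Exception):
    """Hypothetical error raised when an application or server misbehaves."""


def _check(condition, message):
    if not condition:
        raise SpecViolation(message)


def checking_middleware(app):
    """Wrap `app` so obvious interface mistakes fail loudly."""

    def checked_app(environ, start_response):
        _check(type(environ) is dict, "environ must be a plain dict")
        for key in ("REQUEST_METHOD", "SERVER_NAME", "SERVER_PORT"):
            _check(key in environ, "environ is missing %r" % key)

        def checked_start_response(status, headers, *args):
            _check(isinstance(status, str), "status must be a string")
            _check(
                len(status) >= 4 and status[:3].isdigit() and status[3] == " ",
                "status must look like '200 OK'",
            )
            _check(isinstance(headers, list), "headers must be a list")
            for item in headers:
                _check(
                    isinstance(item, tuple) and len(item) == 2,
                    "each header must be a (name, value) tuple",
                )
                _check(
                    all(isinstance(part, str) for part in item),
                    "header names and values must be strings",
                )
            return start_response(status, headers, *args)

        result = app(environ, checked_start_response)
        # Returning one string instead of an iterable of chunks is an easy slip.
        _check(
            not isinstance(result, (str, bytes)),
            "return an iterable of chunks, not a single string",
        )
        return result

    return checked_app
```

Because the wrapper is itself an application, the same checks could be reused in a test suite against any server or gateway.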
>Two models -- one that optimistically tries to load the cgi module in a
>fake environment (what I did), plus another that actually runs any CGI script.
I'm not following what the difference is, exactly, but I guess we'll need
to get into the design more.
>If we use email.Message, using a status header seems fine. If not, I
>think it should be separate -- I don't want to search a list for the
Right, that's all I was thinking.
>I don't think the utility functions are a big deal at all, and I worry
>that there are some gotchas to email.Message, specifically where it is
>intended for email. So I'm certainly not adamantly opposed to
>email.Message, but I'm not adamantly for it either. I'd rather see a
>superclass of email.Message (such a superclass does not yet exist, but
>should be easy to write/extract) that is more minimal.
Why don't you take a look at the code? I have. Here are the methods:
as_string, __str__ -- format the message as a string
is_multipart -- returns true if payload has been set to a list
get_charsets, walk -- stuff for manipulating parts of the message we don't
set_charset/get_charset -- sets the character set parameters of the
content-type, which is actually useful. On the down side, setting the
character set sets MIME-Version, but it also sets the
Content-Transfer-Encoding, so it doesn't force the server to default one.
__len__, __getitem__, __setitem__, __delitem__, __contains__, has_key, get,
keys, values, items -- case-insensitive dictionary-like interface (i.e.,
the stuff we mainly want)
get_all -- all values for a header name
add_header, replace_header -- more stuff we want
get_type, get_main_type, get_subtype, get_content_type,
get_content_maintype, get_content_subtype, get_param,
get_params, set_param, del_param, set_type, get_boundary, set_boundary,
get_content_charset -- miscellaneous content-type analysis and
manipulation. Not necessarily very helpful, except maybe for
middleware. But they hardly hurt.
get_filename -- extract filename from Content-Disposition if present. Not
particularly helpful, but also not damaging in any way.
Perhaps more eyes should look at this, but I haven't found anything in here
that's damaging or even annoying apart from setting MIME-Version if it's
not there and the content-type is touched.
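For anyone who wants to poke at this without reading the source, here is a short sketch of the dictionary-like interface and the MIME-Version side effect. It uses the class under its current import path, `email.message.Message`, which is an assumption about the reader's Python version; the header names and values are invented for the example.

```python
from email.message import Message

headers = Message()
headers["Content-Type"] = "text/html"
headers.add_header("Set-Cookie", "a=1")
headers.add_header("Set-Cookie", "b=2")

# Lookups ignore case; get_all returns every value for a repeated header.
content_type = headers["content-type"]
cookies = headers.get_all("set-cookie")

# One thing to watch: item assignment appends rather than overwrites.
headers["X-Demo"] = "first"
headers["X-Demo"] = "second"
demo_values = headers.get_all("X-Demo")
del headers["X-Demo"]  # deleting removes every occurrence

# Setting the character set touches the content-type and, as a side
# effect, adds MIME-Version if it was not already present.
had_mime_version = "MIME-Version" in headers
headers.set_charset("utf-8")
has_mime_version = "MIME-Version" in headers
charset = headers.get_content_charset()
```

The append-on-assignment behaviour is why `replace_header` (or a `del` first) is the thing to reach for when a header must appear only once.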
>But, I don't know. I'm still up in the air. Really, I just don't like
>wrapping start_response, from a mechanical point of view. It feels
>awkward to me. I wish I could just query the server as to what point in
>the response it is at.
Well, we could offer a facility for that, but first I'd like to explore
what error handling should *do* in different situations.
>>The only other thing that comes to mind is requiring servers to support
>>multiple 'start_response' calls in some way that makes sense for
>>exception handlers, while requiring it to still work in the case where an
>>extension API has already been used for output.
>That seems too hard.
Well, to some extent we have to look at the question of what should happen
in those circumstances anyway, whether we solve the problem in that
specific way or not. Because if the application *does* call start_response
more than once, the server has to be able to handle it *somehow*. Really,
the ultimate error handling *has* to be done by servers, unless they want
to take the route of crashing the entire process when something bad happens.
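As one possible policy, and only as a sketch, a server could allow a repeated start_response call to replace the pending status and headers as long as nothing has been sent, and refuse once output has begun. `ResponseState` and its attributes are hypothetical names for this illustration; the `headers_sent` flag also happens to be the kind of thing an application could query to learn what point the response is at.

```python
class ResponseState:
    """Hypothetical per-request bookkeeping kept by a server."""

    def __init__(self):
        self.status = None
        self.headers = None
        self.headers_sent = False
        self.body = []

    def start_response(self, status, headers):
        if self.headers_sent:
            # Too late to change anything the client has already seen.
            raise RuntimeError("headers already sent; cannot restart response")
        # Before any output, a later call simply replaces the earlier one,
        # which is what an exception handler would need.
        self.status, self.headers = status, list(headers)
        return self.write

    def write(self, data):
        if self.status is None:
            raise RuntimeError("write() called before start_response()")
        # A real server would flush the status line and headers here.
        self.headers_sent = True
        self.body.append(data)
```

What the server does after refusing (log, drop the connection, or something else) is still the open question; the sketch only shows that the server is the one place with enough information to decide.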