[Web-SIG] PEP 444 Goals

Alex Grönholm alex.gronholm at nextday.fi
Fri Jan 7 03:36:32 CET 2011


07.01.2011 04:09, Graham Dumpleton wrote:
> 2011/1/7 Graham Dumpleton<graham.dumpleton at gmail.com>:
>> 2011/1/7 Alex Grönholm<alex.gronholm at nextday.fi>:
>>> 07.01.2011 01:14, Graham Dumpleton wrote:
>>>
>>> One other comment about HTTP/1.1 features.
>>>
>>> You will always be battling to have some HTTP/1.1 features work in a
>>> controllable way. This is because WSGI gateways/adapters aren't often
>>> directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI,
>>> AJP, CGI etc. In this sort of situation you are at the mercy of what
>>> the modules implementing those protocols do, or even are hamstrung by
>>> how those protocols work.
>>>
>>> The classic example is 100-continue processing. This simply cannot
>>> work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting
>>> mechanisms where proxying is performed, as the protocol being used
>>> doesn't implement a notion of end-to-end signalling in respect of
>>> 100-continue.
>>>
>>> I think we need some concrete examples to figure out what is and isn't
>>> possible with WSGI 1.0.1.
>>> My motivation for participating in this discussion can be summed up in that
>>> I want the following two applications to work properly:
>>>
>>> - PlasmaDS (Flex Messaging implementation)
>>> - WebDAV
>>>
>>> The PlasmaDS project is the planned Python counterpart to Adobe's BlazeDS.
>>> Interoperability with the existing implementation requires that both the
>>> request and response use chunked transfer encoding, to achieve bidirectional
>>> streaming. I don't really care how this happens, I just want to make sure
>>> that there is nothing preventing it.
>> That can only be done by changing the rules around how wsgi.input is used.
>> I'll try and find a reference to where I have posted information about
>> this before, otherwise I'll write something up again about it.
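
For illustration, here is a minimal sketch of the kind of two-way streaming
use of wsgi.input being talked about here. It assumes a server that allows
wsgi.input to be read with no CONTENT_LENGTH set and that flushes each
yielded block to the client immediately; neither of those is guaranteed by
WSGI 1.0, so this is not portable as the spec stands:

    def application(environ, start_response):
        # Echo the request body back to the client as it arrives.
        start_response('200 OK',
                       [('Content-Type', 'application/octet-stream')])
        stream = environ['wsgi.input']

        def respond():
            while True:
                # Read whatever the client has sent so far; an empty
                # read signals the end of the request body.
                chunk = stream.read(8192)
                if not chunk:
                    break
                # Yield a response block before the request body has
                # been fully received.
                yield chunk

        return respond()
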
> BTW, even if the WSGI specification were changed to allow handling of
> chunked requests, it would not work for FASTCGI, SCGI, AJP, CGI or
> mod_wsgi daemon mode. It is also not likely to work with uWSGI.
>
> This is because all of these work on the expectation that the complete
> request body can be written across to the separate application process
> before actually reading the response from the application.
>
> In other words, two-way streaming is not possible.
>
> The only solution which would allow this with Apache is mod_wsgi
> embedded mode. mod_wsgi 3.X already has an optional feature which can
> be enabled to let you step outside the current bounds of the WSGI
> specification and use wsgi.input, as I will explain, to do this
> two-way streaming.
>
> Pure Python HTTP/WSGI servers acting as the front-facing server could
> also be modified to handle this if the WSGI specification were changed,
> but whether the same would work when put behind a web proxy will depend
> on how the front-end web proxy works.
Then I suppose this needs to be standardized in PEP 444, wouldn't you agree?
> Graham
>
>>> The WebDAV spec, on the other hand, says
>>> (http://www.webdav.org/specs/rfc2518.html#STATUS_102):
>>>
>>> The 102 (Processing) status code is an interim response used to inform the
>>> client that the server has accepted the complete request, but has not yet
>>> completed it. This status code SHOULD only be sent when the server has a
>>> reasonable expectation that the request will take significant time to
>>> complete. As guidance, if a method is taking longer than 20 seconds (a
>>> reasonable, but arbitrary value) to process the server SHOULD return a 102
>>> (Processing) response. The server MUST send a final response after the
>>> request has been completed.
>> I don't offhand see a way of doing that, as protocols like SCGI and
>> CGI definitely don't allow an interim status. I suspect that FASTCGI
>> and AJP don't allow it either.
>>
>> I'll have to do some digging as to how you would even handle that
>> in Apache with a normal Apache handler.
>>
>> Graham
>>
>>> Again, I don't care how this is done as long as it's possible.
>>>
>>> The current WSGI specification acknowledges that by saying:
>>>
>>> """
>>> Servers and gateways that implement HTTP 1.1 must provide transparent
>>> support for HTTP 1.1's "expect/continue" mechanism. This may be done
>>> in any of several ways:
>>>
>>> * Respond to requests containing an Expect: 100-continue request with
>>> an immediate "100 Continue" response, and proceed normally.
>>> * Proceed with the request normally, but provide the application with
>>> a wsgi.input stream that will send the "100 Continue" response if/when
>>> the application first attempts to read from the input stream. The read
>>> request must then remain blocked until the client responds.
>>> * Wait until the client decides that the server does not support
>>> expect/continue, and sends the request body on its own. (This is
>>> suboptimal, and is not recommended.)
>>> """
>>>
>>> If you are going to try and push for full visibility of HTTP/1.1 and
>>> an ability to control it at the application level then you will fail
>>> with 100-continue to start with.
>>>
>>> So, although option 2 above would be the most ideal, giving the
>>> application control, specifically the ability to send an error
>>> response based on the request headers alone, or to trigger the
>>> 100-continue by reading the request body, it isn't practical to
>>> require it, as the majority of hosting mechanisms for WSGI wouldn't
>>> even be able to implement it that way.
>>>
>>> The same goes for any other feature: there is no point mandating a
>>> feature that can realistically be implemented on only a minority of
>>> implementations. It would be even worse where dependence on such a
>>> feature meant that the WSGI application was no longer portable to
>>> another WSGI server, destroying the notion that WSGI provides a
>>> portable interface.
>>>
>>> This isn't restricted to HTTP/1.1 features either; it also applies
>>> to raw SCRIPT_NAME and PATH_INFO. Only WSGI servers that are directly
>>> hooked into the URL parsing of the base HTTP server can provide that
>>> information, which basically means that only pure Python HTTP/WSGI
>>> servers are likely to be able to provide it without guessing, and
>>> such servers are usually used with the WSGI application mounted at
>>> the root anyway.
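
To make the SCRIPT_NAME/PATH_INFO point concrete, here is an illustrative
split for an invented request URL (the paths are made up):

    # Request for /blog/2011/01/07.
    # A gateway that knows the application is mounted at /blog splits it as:
    mounted = {'SCRIPT_NAME': '/blog', 'PATH_INFO': '/2011/01/07'}
    # A pure Python HTTP/WSGI server with the application at the root reports:
    at_root = {'SCRIPT_NAME': '', 'PATH_INFO': '/blog/2011/01/07'}
    # Either way the full request path can be reconstructed:
    assert mounted['SCRIPT_NAME'] + mounted['PATH_INFO'] == \
           at_root['SCRIPT_NAME'] + at_root['PATH_INFO']
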
>>>
>>> Graham
>>>
>>> On 7 January 2011 09:29, Graham Dumpleton<graham.dumpleton at gmail.com>
>>> wrote:
>>>
>>> On 7 January 2011 08:56, Alice Bevan–McGregor<alice at gothcandy.com>  wrote:
>>>
>>> On 2011-01-06 13:06:36 -0800, James Y Knight said:
>>>
>>> On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
>>>
>>> :: Making optional (and thus rarely-implemented) features non-optional.
>>> E.g. server support for HTTP/1.1 with clarifications for interfacing
>>> applications to 1.1 servers.  Thus pipelining, chunked encoding, et al., as
>>> per the HTTP 1.1 RFC.
>>>
>>> Requirements on the HTTP compliance of the server don't really have any
>>> place in the WSGI spec. You should be able to be WSGI compliant even if you
>>> don't use the HTTP transport at all (e.g. maybe you just send around
>>> requests via SCGI).
>>> The original spec got this right: chunking etc. are something which is not
>>> relevant to the WSGI application code -- it is up to the server to implement
>>> the HTTP transport according to the HTTP spec, if it's purporting to be an
>>> HTTP server.
>>>
>>> Chunking is actually quite relevant to the specification, as WSGI and PEP
>>> 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;) allow for
>>> chunked bodies regardless of higher-level support for chunking, via the body
>>> iterator.  Previously you /had/ to define a length; with chunked encoding at
>>> the server level, you don't.
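
A minimal sketch of what that looks like from the application side, on the
assumption that a server speaking HTTP/1.1 to the client applies chunked
transfer encoding itself when no Content-Length is given:

    import time

    def application(environ, start_response):
        # No Content-Length header is set, so the body length is unknown
        # up front; an HTTP/1.1 server can stream each yielded block as a
        # chunk, while an HTTP/1.0 server has to close the connection to
        # mark the end of the body.
        start_response('200 OK', [('Content-Type', 'text/plain')])

        def body():
            for i in range(5):
                yield ('tick %d\n' % i).encode('ascii')
                time.sleep(1)

        return body()
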
>>>
>>> I agree, however, that not all gateways will be able to implement the
>>> relevant HTTP/1.1 features.  FastCGI does, and SCGI, after a quick Google
>>> search, seems to support it as well. I should re-word it as:
>>>
>>> "For those servers capable of HTTP/1.1 features the implementation of such
>>> features is required."
>>>
>>> I would question whether FASTCGI, SCGI or AJP support the concept of
>>> chunking of responses to the extent that the application can prepare
>>> the final content including chunks as required by the HTTP
>>> specification. Further, in Apache at least, the output from a web
>>> application served via those protocols is still pushed through the
>>> Apache output filter chain so as to allow the filters to modify the
>>> response, e.g., apply compression using mod_deflate. As a consequence,
>>> the standard HTTP 'CHUNK' output filter is still a part of the output
>>> filter stack. This means that were a web application to try and do
>>> chunking itself, then Apache would rechunk such that the original
>>> chunking became part of the content, rather than the transfer
>>> encoding.
>>>
>>> So, in order to be able to achieve what I think you want, with a web
>>> application being able to do chunking itself, you would need to modify
>>> the implementations of mod_fcgid, mod_fastcgi, mod_scgi, mod_ajp and
>>> likely also mod_cgi and mod_cgid in Apache.
>>>
>>> The only WSGI implementation I know of for Apache where you might even
>>> be able to do what you want is uWSGI. This is because I believe from
>>> memory it uses a mode in Apache by default called assbackwards. What
>>> this allows is for the output from the web application to bypass the
>>> Apache output filter stack and directly control the raw HTTP output.
>>> This gives uWSGI a little bit less overhead in Apache, but at the loss
>>> of the ability to actually use Apache output filters and for Apache to
>>> fix up response headers in any way. There is a flag in uWSGI which can
>>> optionally be set to make it use the more traditional mode and not use
>>> assbackwards.
>>>
>>> Thus, I believe you would be fighting against server implementations
>>> such as Apache and likely also nginx, Cherokee, lighttpd etc., to allow
>>> chunking to be supported at the level of the web application.
>>>
>>> About all you can do is ensure that the WSGI specification doesn't
>>> include anything which would prevent a web application from indirectly
>>> harnessing a feature such as chunking where the web server supports
>>> it.
>>>
>>> As it is, chunked responses aren't even the problem, because if an
>>> underlying web server supports chunking for responses, all you need
>>> to do is not set the content length.
>>>
>>> The problem area with chunking is the request content, as the way the
>>> WSGI specification is written prevents having chunked request content.
>>> I have described the issue previously and made suggestions about an
>>> alternative way that wsgi.input could be used.
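
As a sketch of the general shape such an alternative could take (this is an
assumption about the idea, not a quote of that proposal): the application
would read wsgi.input until it returns an empty string, rather than reading
exactly CONTENT_LENGTH bytes, since a chunked request carries no length up
front:

    def drain_request_body(environ, blocksize=8192):
        # Read until the server signals end of input with an empty read,
        # instead of trusting a CONTENT_LENGTH header that a chunked
        # request does not have. WSGI 1.0 does not require servers to
        # support reading the input stream this way.
        stream = environ['wsgi.input']
        pieces = []
        while True:
            data = stream.read(blocksize)
            if not data:
                break
            pieces.append(data)
        return b''.join(pieces)
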
>>>
>>> Graham
>>>
>>> +1
>>>
>>>         - Alice.
>>


