[Web-SIG] HTTP 1.1 Expect/Continue handling

Wed Jan 30 00:01:38 CET 2008

On 29/01/2008, James Y Knight <foom at fuhm.net> wrote:
>
> On Jan 29, 2008, at 1:36 AM, Brian Smith wrote:
>
> > 1. The WSGI gateway must send the response headers immediately when
> > the application yields its first non-empty string.
> >
> > 2. When there is a "100-continue" token in the request "Expect:"
> > header, the WSGI gateway is allowed to delay sending the "100
> > Continue" response until the application reads from
> > environ["wsgi.input"].
> >
> > Consequently, if there is a 100-continue expectation, then a WSGI
> > application must not read from wsgi.input after yielding its first
> > non-empty string.
> >
> > For example, the following application results in undefined
> > (probably erroneous) behavior:
> >
> >     def application(environ, start_response):
> >         start_response("400 Bad Request", [])
> >         yield "400 Bad Request"
> >         environ["wsgi.input"].read(1)
>
> Agreed, this is ambiguous in the WSGI specs. However, there is a
> mitigating factor:
>
> The above example should not cause misbehavior when talking to well-
> designed clients. Clients are basically required to always send the
> request body, whether or not a 100-continue arrives, unless the
> connection gets closed, in order to work with older and misdesigned
> servers. They may delay a bit, to see if the server will close the
> connection, but otherwise ought to start sending the request body in
> any case.
>
> However, this omission in the WSGI spec does allow for violation of
> the HTTP RFC:
> > Upon receiving a request which includes an Expect request-header
> > field with the "100-continue" expectation, an origin server MUST
> > either respond with 100 (Continue) status and continue to read from
> > the input stream, or respond with a final status code. The origin
> > server MUST NOT wait for the request body before sending the 100
> > (Continue) response. If it responds with a final status code, it MAY
> > close the transport connection or it MAY continue to read and
> > discard the rest of the request. It MUST NOT perform the requested
> > method if it returns a final status code.
>
> If you changed your example to start_response("200 OK", []), that
> would violate the "MUST NOT perform the requested method" clause.
>
> I see three ways to resolve this:
>
> a) One is to clarify this as a requirement upon the WSGI gateway.
> Something like the following:
> "If the client requests Expect: 100-continue, and the application
> yields data before reading from the input, and the response code is a
> success (2xx) code, then the gateway MUST send a 100 continue
> response, before writing any other response headers in order to comply
> with RFC 2616 §8.2.3 and to allow the WSGI application to read from
> the input stream later on in request processing".
>
> This should handle most real-world cases. Sending 100 only when the
> response code is 2xx may be a bit fragile, and won't help e.g. your
> dummy app above (maybe some real app really does want the input data
> even for an error response). On the other hand, you really *don't*
> want to force the transmission of a 100 continue when the server is
> sending e.g. a "400 Bad Request" response and likely won't ever read
> input data.
>
> b) Alternatively, the WSGI gateway could raise an exception when you
> attempt to respond with a success code without having read the input.
> This also satisfies RFC2616's prohibition against a successful
> execution of the request without a 100 continue response, but seems to
> me more likely to break things than help them, so I'd say (a) is
> strictly better.
>
> c) Another option is to clarify this as a requirement for a WSGI
> application: "An application must not read from wsgi.input after
> yielding its first non-empty string unless it has already read from
> wsgi.input before having yielded its first non-empty string.
> (environ["wsgi.input"].read(0) may be used to indicate the desire to
> read the input in the future and satisfy this requirement, without
> actually reading any data.)"

A clarification in the specification may be required, to the effect
that where a zero-length read is done, neither WSGI middleware that
wraps wsgi.input nor the WSGI adapter itself may optimise it away. In
other words, a zero-length read must always be passed through, unless
doing so is specifically inappropriate for what the WSGI middleware
is doing.
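
As a rough sketch of what a well-behaved wrapper might look like (the
class names here are made up for illustration and are not taken from
any existing middleware, and I am assuming byte strings for the body):

```python
import io


class CountingInput:
    """Hypothetical wsgi.input wrapper that counts the bytes read."""

    def __init__(self, stream):
        self._stream = stream
        self.bytes_read = 0

    def read(self, size=-1):
        # Deliberately no "if size == 0: return b''" shortcut here:
        # a zero-length read is forwarded so the layer below sees it.
        data = self._stream.read(size)
        self.bytes_read += len(data)
        return data


class CountingMiddleware:
    """Hypothetical middleware that installs the wrapper above."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        environ["wsgi.input"] = CountingInput(environ["wsgi.input"])
        return self.app(environ, start_response)


# Example: a zero-length read still reaches the wrapped stream.
wrapped = CountingInput(io.BytesIO(b"abc"))
wrapped.read(0)
```

The only point of the sketch is the absence of a short-circuit for
size == 0 in read().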

This would be required to ensure that a zero-length read always
propagates down to the web server layer itself, so that it can
trigger the 100-continue.
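
On the web server side, the lazy behaviour could be sketched roughly
as below. This is only an illustration under my own assumptions: the
class name, the send_raw callable and the expects_continue flag are
all hypothetical, not how any particular server or adapter does it.

```python
import io


class ContinueOnReadInput:
    """Hypothetical gateway-side wsgi.input that sends the interim
    "100 Continue" response lazily, on the first read() call."""

    def __init__(self, body, send_raw, expects_continue):
        self._body = body                 # file-like request body
        self._send_raw = send_raw         # callable writing bytes to the client
        self._pending = expects_continue  # True if Expect: 100-continue was seen

    def read(self, size=-1):
        if self._pending:
            # Any read, including read(0), counts as asking for the body.
            self._send_raw(b"HTTP/1.1 100 Continue\r\n\r\n")
            self._pending = False
        return self._body.read(size)


# Example: read(0) triggers the interim response without consuming data.
sent = []
stream = ContinueOnReadInput(io.BytesIO(b"payload"), sent.append, True)
stream.read(0)
```

If anything above this layer swallows the read(0), the interim
response is never triggered, which is the reason for wanting the
pass-through rule.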

This requirement would probably exist independent of (c) being used as
a solution.

Graham

> The way I see it, (a) is not a change in the spec, but just a
> clarification. The combination of the current spec and the HTTP RFC
> implies that you should do that already, in order to not violate RFC
> 2616 (although quite likely nobody actually does, not having realized
> the requirement). (b) on the other hand, is truly a change in the
> spec, but is a bit theoretically cleaner.
>
> > Should the application be able to detect whether there is a "100-
> > continue" token in the Expect header of the request?
>
> No.
>
> > Or, is the WSGI gateway allowed/required to hide the token?
>
> Allowed.
>
> > Another consequence is that an application cannot explicitly respond
> > with a "100 Continue" itself, like this:
> >
> >     def application(environ, start_response):
> >         start_response("100 Continue", [])
> >         yield ""
> >         start_response("200 OK", [])
> >         yield "OK"
> >
> > The reason is that start_response cannot be called twice except
> > when an exception is detected, and also the "100 Continue" would not
> > be sent until right before the "200 OK" was sent anyway.
>
> That's not really a consequence of the above discussion, but, yes,
> that's true.
>
> James