[Web-SIG] WSGI Content-Length issues.

Wed Jan 9 05:27:06 CET 2008

Can the group mind provide some clarification on the following, please?

1. The WSGI specification does not require that a WSGI adapter provide
an EOF indicator if an attempt is made to read more data from
wsgi.input than is defined by the request Content-Length. Is a WSGI
adapter nevertheless required to explicitly discard any request
content which wasn't consumed, or is it the WSGI application's
responsibility to ensure that all request content up to the length
specified is always consumed?

I have seen some reports suggesting that some WSGI adapters/servers do
not discard unread content up to Content-Length, with the result that,
if Keep-Alive is enabled, the server may incorrectly try to interpret
the remaining content as the header of the next request on that same
socket connection.

Some spam bots on the net which POST to arbitrary URLs are quite good
at triggering this scenario where WSGI applications don't consume
request content when they weren't expecting it.

If the WSGI specification isn't clear on the responsibility of a WSGI
adapter to discard any request content that wasn't consumed, then any
WSGI application that is to work on all hosting mechanisms would have
to ensure it always consumes request content, even where none is
expected for a URL.
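
To make the workaround concrete, here is a minimal sketch of middleware
that drains whatever the application left unread. The names
LimitedInput and drain_middleware are hypothetical and are not part of
the WSGI specification. For brevity only read() is wrapped, so an
application relying on readline() or iteration would need more than
this.

```python
class LimitedInput:
    """Hypothetical wrapper: never read past Content-Length."""

    def __init__(self, stream, length):
        self.stream = stream
        self.remaining = length

    def read(self, size=-1):
        # Clamp every read to the bytes still owed by the client.
        if size is None or size < 0 or size > self.remaining:
            size = self.remaining
        if size == 0:
            return b''
        data = self.stream.read(size)
        self.remaining -= len(data)
        return data

    def drain(self, chunk_size=8192):
        # Throw away whatever the application did not consume.
        while self.remaining > 0:
            if not self.read(min(chunk_size, self.remaining)):
                break  # client closed the connection early


def drain_middleware(application):
    def wrapper(environ, start_response):
        try:
            length = int(environ.get('CONTENT_LENGTH') or 0)
        except ValueError:
            length = 0
        limited = LimitedInput(environ['wsgi.input'], length)
        environ['wsgi.input'] = limited
        result = application(environ, start_response)
        try:
            for data in result:
                yield data
        finally:
            if hasattr(result, 'close'):
                result.close()
            # Discard the unread body so that a Keep-Alive connection
            # starts cleanly at the next request line.
            limited.drain()
    return wrapper
```

An application would then be wrapped as drain_middleware(app). If
adapters were required to do the equivalent themselves, none of this
would be needed in application code.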

2. If a WSGI application sets a Content-Length in a response and then
returns response content of a greater length, should the WSGI adapter
attempt to discard any additional output beyond the length set by the
application, or just pass it through? What obligations do WSGI
middleware have in this respect?

If the answer is that the WSGI adapter shouldn't care and should just
pass everything through, would it at least be seen as prudent for the
WSGI adapter to log a warning message that the returned response
content differs in length from the specified Content-Length? The same
applies where a WSGI application finishes successfully but doesn't
return as much output as it said it was going to.
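
As an illustration of one possible behaviour (not something the
specification mandates), the sketch below truncates output at the
declared Content-Length and logs a warning on any mismatch. The name
check_content_length and the logger name are hypothetical. Output sent
through the write() callable is not tracked, and responses that
legitimately carry no body (HEAD, 304) would trigger the "short"
warning.

```python
import logging

log = logging.getLogger('wsgi.content_length')  # hypothetical name


def check_content_length(application):
    def wrapper(environ, start_response):
        state = {'declared': None}

        def recording_start_response(status, headers, exc_info=None):
            # Remember the length the application promised, if any.
            for name, value in headers:
                if name.lower() == 'content-length':
                    state['declared'] = int(value)
            return start_response(status, headers, exc_info)

        result = application(environ, recording_start_response)
        sent = excess = 0
        try:
            for data in result:
                declared = state['declared']
                if declared is not None:
                    # Pass through only the bytes still allowed.
                    allowed = max(declared - sent, 0)
                    excess += max(len(data) - allowed, 0)
                    data = data[:allowed]
                sent += len(data)
                yield data
        finally:
            if hasattr(result, 'close'):
                result.close()
        declared = state['declared']
        if excess:
            log.warning('discarded %d bytes beyond Content-Length %d',
                        excess, declared)
        elif declared is not None and sent < declared:
            log.warning('sent %d bytes, Content-Length was %d',
                        sent, declared)
    return wrapper
```

Whether truncation is the right choice, as opposed to passing the
extra data through and only warning, is exactly the open question.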

3. Similarly, where a WSGI adapter supports wsgi.file_wrapper and the
Content-Length header was set in the response, should the WSGI adapter
send at most that amount of data? This question applies whether or not
the WSGI adapter is able to optimise the sending of the response
because of the presence of fileno() or some other platform-specific
feature which would facilitate such optimisations.
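
For the unoptimised path, honouring Content-Length could look roughly
like the following sketch. The function send_file_limited and its
parameters are hypothetical; write stands in for whatever the adapter
uses to push bytes to the client.

```python
def send_file_limited(fileobj, write, content_length=None,
                      block_size=8192):
    """Copy fileobj to write(), stopping at content_length if given."""
    remaining = content_length
    sent = 0
    while remaining is None or remaining > 0:
        # Never ask for more than the bytes still permitted.
        if remaining is None:
            size = block_size
        else:
            size = min(block_size, remaining)
        data = fileobj.read(size)
        if not data:
            break  # file ended before the declared length
        write(data)
        sent += len(data)
        if remaining is not None:
            remaining -= len(data)
    return sent
```

An optimised path built on fileno() would presumably need to apply
the same limit when deciding how many bytes to hand to the platform.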

4. Where a WSGI adapter supports wsgi.file_wrapper and the
Content-Length header was NOT set in the response, and where
optimisations are being performed such that the WSGI adapter can (or
must, in order to send it) calculate the length of the output, can the
WSGI adapter add its own Content-Length header indicating the actual
amount of response content sent?
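
A sketch of what adding the header might involve, assuming the wrapped
object is a regular file exposing fileno() and tell() and that its
size does not change while it is being sent. The helper name
add_content_length is hypothetical.

```python
import os


def add_content_length(headers, fileobj):
    """Return headers, with Content-Length appended if computable."""
    if any(name.lower() == 'content-length' for name, _ in headers):
        return headers  # the application already set it
    try:
        size = os.fstat(fileobj.fileno()).st_size
        position = fileobj.tell()
    except (AttributeError, OSError, ValueError):
        return headers  # not a real file; length unknown
    # Only the bytes from the current position onwards will be sent.
    remaining = max(size - position, 0)
    return headers + [('Content-Length', str(remaining))]
```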

Graham