[Web-SIG] environ["wsgi.input"].read()

Sun Jan 27 20:47:03 CET 2008

Graham Dumpleton wrote:
> On 26/01/2008, Brian Smith <brian at briansmith.org> wrote:
> As to your questions about read() with no argument, or with 
> traditional Python file like object default of -1, the only 
> WSGI server/adapter I know of where this will NOT work as one 
> would expect, ie., read remainder of request content, is the 
> CherryPy WSGI adapter.
>
> As far as I know it works fine with Apache CGI WSGI adapters, 
> Apache mod_wsgi, plus SCGI, FASTCGI and AJP adapters via 
> flup, as well as with paste WSGI server. Not sure what 
> wsgiref will do though.

It doesn't work on mod_wsgi either. When I tried it, it only returned
8000 bytes of the input. That is why I started this thread in the first
place, actually. If this isn't the behavior you expected, I will file a
bug with a test case. (Google Code doesn't allow attachments to bug
reports, so maybe I will create my own "WSGI testcases" project on
Google Code to store them all in SVN.)

> If the WSGI specification simply required that EOF be simulated then
> read() with no arguments, or -1 argument, could mean return 
> all remaining content with absolutely no problems. 
> Implementations would also naturally lend themselves to 
> dealing with unconsumed input correctly.

It is too late for WSGI 1.0. The best we can do is say that WSGI
gateways and middleware should implement read() like this, but WSGI
applications and middleware should not depend on it.
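
To make "implement read() like this" concrete, here is a rough sketch of
a gateway-side wrapper that simulates EOF once Content-Length bytes have
been consumed. The class name LimitedInput is hypothetical and does not
come from any existing gateway, and the sketch assumes Python 3 byte
strings. It is an illustration only, not how mod_wsgi or any other
adapter actually does it.

```python
import io


class LimitedInput:
    """Hypothetical wrapper that simulates EOF after `length` bytes."""

    def __init__(self, stream, length):
        self._stream = stream
        self._remaining = max(length, 0)

    def read(self, size=-1):
        # No argument, None, or a negative size means "all that is left".
        if size is None or size < 0 or size > self._remaining:
            size = self._remaining
        chunks = []
        while size > 0:
            chunk = self._stream.read(size)
            if not chunk:
                break  # the underlying stream ended early
            chunks.append(chunk)
            size -= len(chunk)
            self._remaining -= len(chunk)
        return b"".join(chunks)


# The socket may already hold bytes belonging to the next request.
raw = io.BytesIO(b"hello world" + b"NEXT REQUEST")
wrapped = LimitedInput(raw, 11)
first = wrapped.read(5)
rest = wrapped.read()
```

With a wrapper along these lines, read() can never run past the end of
the current request body, which is the property applications would like
to rely on.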

> This would subsequently also allow mutating input filters 
> which change the content length, which could then be flagged 
> by setting Content-Length header to -1.

This has to wait until a new version of WSGI. Too many applications are
written with an expectation of a non-negative Content-Length.

> What this still doesn't solve is chunked request content. But 
> then, I don't believe the existing read() method is suitable 
> for that, as what you want with chunked request content, is 
> not return me all input, but return me the next available 
> chunk. As such, some sort of separate abstraction may be 
> required for dealing with chunked request content, using a 
> special argument to read() just isn't going to work.

I agree that a non-blocking variant of read() would be very useful.

> Anyway, in the past, as with many issues it seems people just 
> want to shove this all to be worried about in WSGI 2.0 rather 
> than actually trying to fix all the inconsistencies and sub 
> optimal stuff in WSGI 1.0.

This issue isn't critical like the GET vs. HEAD issue. WSGI applications
can easily work around this issue by simply always supplying a
non-negative size argument to read(). The GET/HEAD issue is so tedious
to work around that it really needs to be addressed in PEP 333.
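
For illustration, the workaround might look something like the sketch
below. The helper name read_request_body and the 8192-byte chunk size
are my own choices for the example, and it assumes Python 3 byte
strings. It only ever calls read() with a non-negative size and stops
at CONTENT_LENGTH.

```python
import io


def read_request_body(environ, chunk_size=8192):
    """Read the request body using only non-negative read() sizes."""
    try:
        remaining = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        remaining = 0  # treat a malformed header as "no body"
    remaining = max(remaining, 0)

    stream = environ["wsgi.input"]
    chunks = []
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            break  # the client sent less than it promised
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


environ = {
    "CONTENT_LENGTH": "5",
    "wsgi.input": io.BytesIO(b"helloEXTRA"),
}
body = read_request_body(environ)
```

The loop matters because a single read(n) is not guaranteed to return
everything at once on every gateway.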

> All in all I can appreciate the problems some feel in respect 
> of trying to write a true portable WSGI application. If you 
> keep to the core stuff all is okay, start to do complex stuff 
> where the PEP isn't perhaps well defined and you start to run 
> into problems as to what it means and whether it is actually 
> portable. Waiting for WSGI 2.0 isn't really an option since 
> it isn't even going to be interface compatible and frankly 
> may never get done anyway because people will think 1.0 is 
> good enough even if it is not as good as it could be.

The main problems I have run into are the GET/HEAD issue, and problems
with gateways that cannot handle applications that do not read (enough
of) the request body. These are both issues where the example CGI
WSGI gateway in the PEP is inadequate, and the inadequacy of the example
gateway has spread to other implementations that have overlooked the
same issues. 
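
One possible way for a gateway to cope with unread input is to discard
whatever the application left behind before it finishes the response.
The sketch below is only an assumption about how that could be done.
The function name discard_unread is hypothetical, and the gateway is
assumed to know how many body bytes are still unread.

```python
import io


def discard_unread(stream, unread, chunk_size=8192):
    """Read and throw away up to `unread` bytes; return the count discarded."""
    discarded = 0
    while unread > 0:
        chunk = stream.read(min(chunk_size, unread))
        if not chunk:
            break  # the client closed the connection early
        discarded += len(chunk)
        unread -= len(chunk)
    return discarded


# The application read only 1 byte of a 4-byte body.
stream = io.BytesIO(b"bodyNEXT")
stream.read(1)
count = discard_unread(stream, 3)
```

Whether draining is the right policy (as opposed to closing the
connection, say) depends on the gateway; the point is only that it has
to do something deliberate.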

It does seem like there is a lot of resistance to modifying PEP 333,
even though it is just a draft. There are a lot of benefits to having a
feature freeze for WSGI 1.0. But, it is also advantageous to remove any
ambiguities in the PEP. In particular, I don't see any disadvantages to
adding a statement that the behavior of read() is only well defined when
a non-negative size argument is supplied.

- Brian