[Web-SIG] thoughts on an iterator

Mon Mar 30 19:36:21 CEST 2009

Hi all,

It was great to meet (nearly) everybody at PyCon; I look forward to
the next time.

I particularly want to thank Robert for being so meticulous about
recording and reporting the discussions; a necessary part of moving
forward, IMO.

[Robert]
> Hmmmm. Graham brought up chunked requests which I don't think have much
> bearing on this issue--the server/app can't rely on the client-specified
> chunk sizes either way (or you enable a Denial of Service attack). I
> don't see much difference between the file approach and the iterator
> approach, other than moving the read chunk size from the app (or more
> likely, the cgi module) to the server. That may be what kills this
> proposal: cgi.FieldStorage expects a file pointer and I doubt we want to
> either rewrite the entire cgi module to support iterators, or re-package
> the iterator up as a file.

I recommend that any discussion of file-like vs. iterator input be
informed by this exchange between PJE and me, from back when the spec
was being written.

http://mail.python.org/pipermail/web-sig/2004-September/000885.html

The most relevant quote:

[PJE]
> Aha!  There's the problem.  The 'read()' protocol is what's wrong.  If
> 'wsgi.input' were an *iterator* instead of a file-like object, it would be
> fairly straightforward for async servers to implement "would block" reads
> as yielding empty strings.  And, servers could actually support streaming
> input via chunked encoding, because they could just yield blocks once
> they've arrived.
>
> The downside to making 'wsgi.input' an iterator is that you lose control
> over how much data to read at a time: the upstream server or middleware
> determines how much data you get.  But, it's quite possible to make a
> buffering, file-like wrapper over such an iterator, if that's what you
> really need, and your code is synchronous.  (This will slightly increase
> the coding burden for interfacing applications and frameworks that expect
> to have a readable stream for CGI input.)  For asynchronous code, you're
> just going to invoke some sort of callback with each block, and it's the
> callback's job to deal with it.
>
> What does everybody think?  If combined with a "pause iterating me until
> there's input data available" extension API, this would let the input
> stream be non-blocking, and solve the chunked-encoding input issue all in
> one change to the protocol.  Or am I missing something here?

http://mail.python.org/pipermail/web-sig/2004-September/000890.html
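
To make the "buffering, file-like wrapper over such an iterator" idea
concrete, here is a minimal sketch of one way it could look. It is an
illustration only, not anything from the spec or from that thread. It
assumes a modern Python where the input blocks are bytes, and the names
IteratorInput and as_file are hypothetical. Empty blocks (the "would
block" signal in PJE's proposal) are simply skipped, which only makes
sense for synchronous code.

```python
import io


class IteratorInput(io.RawIOBase):
    """Read-only raw stream over an iterator of byte blocks (hypothetical name)."""

    def __init__(self, blocks):
        self._blocks = iter(blocks)
        self._leftover = b""  # unread tail of the most recent block

    def readable(self):
        return True

    def readinto(self, buffer):
        # Pull blocks until one has data; empty blocks are skipped,
        # so a synchronous caller just waits on the iterator.
        while not self._leftover:
            try:
                self._leftover = next(self._blocks)
            except StopIteration:
                return 0  # iterator exhausted: end of input
        n = min(len(buffer), len(self._leftover))
        buffer[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


def as_file(blocks, buffer_size=io.DEFAULT_BUFFER_SIZE):
    """Wrap an iterator of byte blocks so callers get read()/readline()."""
    return io.BufferedReader(IteratorInput(blocks), buffer_size=buffer_size)


if __name__ == "__main__":
    body = as_file([b"name=", b"", b"alan&list=", b"web-sig"])
    print(body.read(4))
    print(body.read())
```

With something like this, the read size is back under the application's
control even though the block sizes are chosen upstream, which is the
trade-off PJE describes.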

I'd also be interested in the Twisted folk's take on that discussion.

All the best,

Alan.