[Web-SIG] thoughts on an iterator

Mon Mar 30 18:42:10 CEST 2009

2009/3/28 Robert Brewer <fumanchu at aminus.org>:
> Hmmmm. Graham brought up chunked requests which I don't think have much
> bearing on this issue--the server/app can't rely on the client-specified
> chunk sizes either way (or you enable a Denial of Service attack). I
> don't see much difference between the file approach and the iterator
> approach, other than moving the read chunk size from the app (or more
> likely, the cgi module) to the server. That may be what kills this
> proposal: cgi.FieldStorage expects a file pointer and I doubt we want to
> either rewrite the entire cgi module to support iterators, or re-package
> the iterator up as a file.

There are some alternate implementations of the cgi POST-parsing
functionality, some of which might be more amenable to using an
iterator.  For that matter, probably none of us has read the cgi
module with this in mind.  From a quick look, it'll be slightly tricky
because it uses .readline a lot, but there's not that much code
involved, so it can't be too hard.
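
To make the .readline point concrete, here is a minimal sketch of a
line reader layered over an iterator of byte chunks.  The class name
IterLineReader is made up for illustration and is not part of the cgi
module or of any proposal in this thread.  It also assumes an unbounded
readline(); a real parser would want a size limit on each line.

```python
class IterLineReader:
    """Hypothetical adapter: readline() on top of an iterator of byte chunks."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._buf = b""

    def readline(self):
        # Pull chunks until the buffer holds a complete line or input ends.
        while b"\n" not in self._buf:
            chunk = next(self._chunks, None)
            if chunk is None:
                # Input exhausted: return whatever is left (possibly b"").
                line, self._buf = self._buf, b""
                return line
            self._buf += chunk
        line, sep, self._buf = self._buf.partition(b"\n")
        return line + sep


# Chunk boundaries deliberately fall in the middle of lines.
reader = IterLineReader(
    [b"--bound", b"ary\r\nContent-", b"Type: text/plain\r\n\r\nbo", b"dy"]
)
lines = []
while True:
    line = reader.readline()
    if not line:
        break
    lines.append(line)
```

The parser still sees whole lines regardless of where the chunks were
split, which is the only property the .readline-heavy code needs.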

For clarity, I think everyone has been discussing an *iterator*, not
an iterable, though I've seen both terms used; an iterable would add
a lot of unnecessary overhead.

I don't agree with Graham's objection.  As I see it, the reason to
read specific-sized chunks is that you don't want to read too much
data into memory at one time.  But the server is free to limit the
size of the chunks the iterator yields, and once the strings are in
memory the consumer really isn't any better off reading a smaller
chunk than what is available.
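
As a rough sketch of the server side of that argument: a generator
that caps the size of each chunk it hands to the application.  The
name cap_chunks and the max_size default are assumptions for
illustration only.  A real server would bound its socket reads rather
than slice strings it already holds, so this only shows the interface
the application would see.

```python
def cap_chunks(chunks, max_size=8192):
    """Yield the same bytes as `chunks`, with no piece larger than max_size."""
    for chunk in chunks:
        # Empty chunks produce no output; large ones are split in order.
        for start in range(0, len(chunk), max_size):
            yield chunk[start:start + max_size]


# The application just iterates; it never has to pick a chunk size.
pieces = list(cap_chunks([b"a" * 10, b"bc", b""], max_size=4))
total = sum(len(p) for p in pieces)
```

The size decision lives in one place, on the server, and the consumer
takes whatever each iteration yields.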

This also means I can stop making up entirely random chunk sizes in
applications.  Applications have no real information to inform this
chunking.  If the string is already in memory, re-chunking it is
actually counterproductive.

-- 
Ian Bicking  |  http://blog.ianbicking.org