2 cents on file objects... WAS: RE: [Web-SIG] Bill's comments on WSGI draft 1.4

Thu Sep 2 16:33:26 CEST 2004

[Michael C. Neel]
 > The WSGI can simple state that the
 > return can be used both as a file object and an iterable (which isn't
 > that a bit redundant, I'll have to check but file objects are iterable
 > correct?)

I spent yesterday discussing this with Phillip, and now that I 
understand his design decision, I think it's the right one.

Having frameworks and *all* middleware components deal with both files 
and iterables is an extra and unnecessary complication.

And under Python 2.2+, it's irrelevant anyway, because files *are* 
iterables. A problem only arises on <= 2.1 interpreters, which don't 
support iterators nearly as well as 2.2. And that's only a problem 
because of Jython being 2.1 only: a problem I seem determined to make my 
own ;-)
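
To illustrate the point about files being iterables, here is a minimal 
sketch written for a modern Python 3 interpreter, not the 2.1/2.2 
interpreters discussed above. The helper name `iter_blocks` and its 
`block_size` parameter are hypothetical, introduced only to show how an 
object that merely has a read() method could be wrapped as an iterable.

```python
import io


def iter_blocks(fileobj, block_size=8192):
    """Hypothetical helper: expose any object with read() as an iterable of blocks."""
    while True:
        block = fileobj.read(block_size)
        if not block:
            # An empty read means end of file.
            break
        yield block


# An in-memory file stands in for a real file on disk.
f = io.BytesIO(b"line one\nline two\n")

# File objects are themselves iterable: iterating yields one line at a time.
lines = list(f)

# The same file, rewound and consumed through the wrapper in fixed-size blocks.
f.seek(0)
blocks = list(iter_blocks(f, block_size=4))
```

Either way, the consumer only ever sees an iterable, which is why a 
framework that accepts iterables does not need a separate code path for 
files.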

The strength of returning an iterable is that the framework can then 
control *when* the output is generated and sent. This fits perfectly 
with Python's greatest strength in the web arena: its simple and 
powerful mechanisms for event-driven processing.
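
A small sketch of what "control *when*" means in practice, assuming the 
application body is written as a generator. The names `body` and 
`events` are made up for the illustration; the point is only that 
nothing is generated until the consumer asks for the next item.

```python
events = []


def body():
    """Hypothetical application body: each part is produced only on demand."""
    for n in range(2):
        events.append("app: generating part %d" % n)
        yield ("part %d" % n).encode("ascii")


# Creating the iterable runs none of the generator's code.
it = body()
events.append("server: iterable created")

# The consumer decides when each part is generated by calling next().
first = next(it)
events.append("server: sent first part")
second = next(it)
```

After this runs, `events` records the "iterable created" entry before 
any "generating" entry, and the second part is generated only after the 
first has been sent.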

Robert Oschler asked earlier about the write callable vs. returning an 
iterator. I was going to reply, but Phillip got there before me. I would 
only add the following to his excellent explanation.

1. The write callable is only there to support "push" applications, 
where the application generates output and then pushes it through a 
channel set up by the server/framework, thus relegating the framework to 
a kind of dumb switchboard. This sort of design is usually used in 
threaded servers, which can present scalability problems.

2. The main focus on iterators is the right one because it not only 
supports "push", as described above, but it also supports "pull", i.e. 
where the framework "pulls" output from the application when the time is 
right. This is a good thing because the framework is in the best 
position to know when the client is actually ready to receive the 
output, through the use of events/readiness notification on 
the client socket. The output is only transiently created when required 
and transmitted immediately to the user (potentially with no copying or 
buffering at all!): you don't have large lumps of output hanging around, 
consuming memory.
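
The two styles in points 1 and 2 can be sketched side by side. This is 
only an illustration, modeled on the interface as it later appeared in 
the published WSGI spec (start_response returning a write callable); the 
1.4 draft under discussion may differ in detail. The names `push_app`, 
`pull_app` and `run` are hypothetical, and `run` is a toy stand-in for a 
server, not a real one.

```python
def push_app(environ, start_response):
    """'Push' style: the application drives output through write()."""
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    for i in range(3):
        write(("chunk %d\n" % i).encode("ascii"))
    return []  # nothing left for the server to pull


def pull_app(environ, start_response):
    """'Pull' style: the application hands back an iterable of chunks."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    for i in range(3):
        yield ("chunk %d\n" % i).encode("ascii")


def run(app):
    """Toy server loop that records how each chunk reached it."""
    log = []

    def start_response(status, headers):
        def write(data):
            # Called by the application whenever *it* chooses.
            log.append(("pushed", data))
        return write

    for chunk in app({}, start_response):
        # An event-driven server would wait here until the client
        # socket is writable before asking for the next chunk.
        log.append(("pulled", chunk))
    return log


push_log = run(push_app)
pull_log = run(pull_app)
```

The same loop in `run` serves both applications, but only in the pull 
case does the server get a chance to pause between chunks.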

If you want to create an architecture that works for both "push" and 
"pull", iterators are the way to go.

I do find it interesting that we've had no comments from the Zope or 
Twisted people. Glad to see Medusa people here though :-)

Kind regards,

Alan.

P.S. Phillip, I hope you're not affected by that hurricane! I have 
friends in Tampa who counted themselves lucky to have escaped Charley: 
now here comes another one! It appears on the surface that the frequency 
of hurricanes in the Gulf is increasing.