[Web-SIG] Web Container Interface

Tue Jan 27 23:39:22 EST 2004

At 09:35 PM 1/27/04 -0500, Greg Ward wrote:

>First of all, the absolute #1 best thing about the Java Servlet API is
>that it provides a complete but simple object-oriented wrapper for HTTP
>request-processing in the form of the HttpServletRequest and
>HttpServletResponse classes.  (I can't say offhand if the wrapping is
>100% perfectly complete, but I can say that it provides clean, simple
>access to every feature of HTTP I need in my day-to-day work.)

Unfortunately, this is also 100% out of scope for the interface, because 
every framework out there already has its own request and response 
types.  If Python had this from the mythical "day one", we'd have had a 
chance, but alas it's far too late for that.

>OTOH, the worst thing about the Java Servlet API is the notion of a
>servlet.  There are two problems here:
>
>   * premature overgeneralization; it looks like the servlet API was
>     designed to allow people to someday write servlets for FTP
>     servers or other as-yet-unknown protocols.  This is stupid;
>     web applications use HTTP.  Period.

I'll take that as a +1 for the HTTP-specificity of the existing proposal.  :)

>   * the level of granularity is wrong: most Java web applications
>     consist of multiple servlets, and if the code I work on in my
>     day job is any indication, there's a lot of overlapping code
>     among the servlets in a given application.  Thus, the point
>     of entry between a web application container and a collection
>     of web applications should be... the web application.
>
>     (The Java community has figured this out; when you administer a
>     modern servlet container like Tomcat, you generally work at the
>     level of web apps, rather than individual servlets or the whole
>     container.  The existence of "servlets" as a separate entity
>     complicates both administering a servlet container and writing web
>     applications.  It's a nasty design flaw that we should strenuously
>     avoid.)

I'll take this, in conjunction with some of your later comments below, as a 
vote in favor of retaining "application" as the name for the thing that a 
gateway calls 'runCGI' on.  :)

>The other thing that bugs me about the Java world is that their web
>application containers -- Tomcat in particular, since that's the one I
>use every day -- are enormously complex, bloated beasts.  They're hard to
>understand, hard to set up, and hard to administer.  They keep thousands
>of people employed at banging their heads against confusing, arcane XML
>config files.  (Come to think of it, the same could be said of Java web
>development frameworks.)
>
>My gut feeling is that a barebones web container -- say, one that
>enables Quixote applications to run as FastCGI scripts -- should
>fit into 10 lines of Python code.  A super-duper, whiz-bang,
>all-singing, all-dancing container -- one that enables applications
>written under N different frameworks to execute using M different models -- should fit
>in roughly 1000 lines of Python.

All the containers I've written so far weigh in at a lot less than 100 
lines; even the BaseHTTPServer one was only maybe 200.  I've only tested 
for N=3 and M=3 so far, though.  (Three frameworks: Zope 2, Zope 3, 
plain CGI; three models: plain CGI, FastCGI, and BaseHTTPServer.)
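
To give a feel for how small a plain-CGI container can be, here is a rough 
sketch.  Only the method name 'runCGI' comes from this thread; the argument 
list (three streams plus an environ mapping), the HelloApp class, and the 
run_plain_cgi helper are assumptions made up for illustration, not the 
proposal's actual definitions.

```python
import io
import os
import sys


class HelloApp:
    """Hypothetical application; only the name 'runCGI' is from the thread."""

    def runCGI(self, stdin, stdout, stderr, environ):
        # Assumed signature: CGI-style streams plus an environment mapping.
        path = environ.get("PATH_INFO", "/")
        stdout.write("Status: 200 OK\r\n")
        stdout.write("Content-Type: text/plain\r\n\r\n")
        stdout.write("Hello from %s\n" % path)


def run_plain_cgi(app, stdin=None, stdout=None, stderr=None, environ=None):
    """Plain-CGI container sketch: pass the process's own streams and
    environment to the application unless replacements are supplied."""
    app.runCGI(
        stdin if stdin is not None else sys.stdin,
        stdout if stdout is not None else sys.stdout,
        stderr if stderr is not None else sys.stderr,
        environ if environ is not None else dict(os.environ),
    )


if __name__ == "__main__":
    run_plain_cgi(HelloApp())
```

Because the container only supplies streams and an environment, the same 
application object could in principle be handed to a FastCGI or 
BaseHTTPServer container that builds those four arguments differently.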

>One big challenge I can foresee: the Python community will never allow a
>standard web container interface to mandate a particular execution
>model, as the Java Servlet API does.  Writing a single API that handles
>both Twisted/Medusa-style (event-driven I/O) and Java-style (threaded
>I/O) will be difficult; it might be impossible.

The standard way in both Zope and Twisted to deal with this is to run 
blocking applications in a thread, allocated from a thread pool, while the 
event dispatch loop runs in the "main" thread.  So, both frameworks already 
offer ready-made APIs for this sort of thing.  In other words, it's not 
impossible, and though it might be difficult, the work has already been 
done in some major frameworks that have event-driven I/O loops.
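
The pattern itself can be sketched with nothing but the standard library.  
This is not the Zope or Twisted API; asyncio and concurrent.futures stand in 
for the event loop and the thread pool, and blocking_app, serve, and main 
are hypothetical names used only for illustration.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_app(request_id):
    # Stand-in for a blocking application call (hypothetical).
    time.sleep(0.05)
    return "response %d" % request_id


async def serve(request_ids, pool):
    loop = asyncio.get_running_loop()
    # Each blocking call is handed to a pool thread, so the event loop
    # in the main thread stays free to dispatch other I/O.
    futures = [
        loop.run_in_executor(pool, blocking_app, rid) for rid in request_ids
    ]
    # gather() returns results in the order the calls were submitted.
    return await asyncio.gather(*futures)


def main(request_ids=(1, 2, 3)):
    with ThreadPoolExecutor(max_workers=4) as pool:
        return asyncio.run(serve(request_ids, pool))


if __name__ == "__main__":
    print(main())
```

The application code never needs to know an event loop exists; it just 
blocks in its own thread, which is the point of the approach.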

>   (Hmmm, maybe there is a
>third model: traditional Unix-style (multiprocess I/O).)  I would rather
>see two (three?) related APIs than one really complicated API that tries
>to cover all the bases.

Hm, maybe I actually should bump the number of models I listed 
above.  :)  My "millions of pages/month" app uses a preforking process 
model of serving FastCGI.  It wraps my existing FastCGI container that uses 
-- you guessed it -- runCGI().

Oh, and it uses event-driven I/O loops to communicate between the parent 
and the subprocesses, as well as to monitor the FastCGI socket...

I'm saying all this not to brag about my "mad skillz", but to point out 
that I wrote the 'runCGI' proposal to cover *actual* container 
implementations that I had already used in a variety of process models (and 
at least two protocols) in production environments.  It is not a 
theoretical proposal, but a report on actual use experience.

>Finally, in response to a later remark by Philip (I think): I definitely
>like calling the things that web developers write "web applications".
>"Web service" implies to me a special case of web application that does
>not have a human user interface.  And I'm perfectly comfortable calling
>the software that runs web applications an "application container".
>"Application engine" and "application server" also make sense to me.
>Whatever terminology we pick, it should be carefully defined in that
>PEP!

Ian's comments made it appear to me that "application" was too vague and 
potentially prone to misunderstandings.  "Service" seemed to eliminate some 
of those.  "Servlet" is another possibility, but of course it would carry 
some inaccurate connotations from Java.  Certainly, I'm open to other 
suggestions, but I'd prefer something that starts with an 'S' now that I've 
gone and written a 'WSGIServer' module...  ;)