[Web-SIG] Web Container Interface

Thu Jan 29 12:51:05 EST 2004

At 11:13 PM 1/28/04 -0600, Ian Bicking wrote:
>So what alternative do you propose for handling a shutdown?  The 
>application *needs* to know about this.  I don't think I trust atexit 
>(though I'm open to it if it really would work).  Also, if the application 
>spawns threads that will simply block shutdown unless they can be told to stop.

What do you propose to use instead of atexit?

Suppose we add a 'shutdown()' method, analogous to Java servlets' 
'destroy()'.  How is the *container* going to guarantee it'll get 
called?  If we define it as "best effort", then the application writer who 
wants a guarantee is *still* going to have to use atexit, or something 
else.  Or else we're going to force the container to use atexit, whether 
the service needs a shutdown message or not, and bloat both the container 
and the number of atexit functions registered, while duplicating this 
functionality in every container!

I haven't used a framework or written an application that needed an 
explicit shutdown in order to operate properly.  However, if one is needed, 
that's what 'atexit' is for, and it has one of the stronger cleanup 
guarantees of anything in Python that I know of!

So, here's what I would suggest...  if we want to allow containers to start 
and shut down servlets at runtime, we can add a 'shutdown()' method.  BUT, I 
don't want to *require* the container to call it.  If the servlet wants a 
guaranteed shutdown, it *must* use atexit or some other finalization strategy.
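
A minimal sketch of that arrangement, in present-day Python.  The class name 
'Service' and its 'closed' flag are made up for illustration; nothing in the 
draft interface defines them.  The point is only that the servlet registers 
its own cleanup with atexit, and that 'shutdown()' is safe to call more than 
once, so it doesn't matter whether the container calls it or not:

```python
import atexit


class Service:
    """Hypothetical servlet-like object; names are illustrative only."""

    def __init__(self):
        self.closed = False
        # The service arranges its own guaranteed cleanup, so the
        # container is free to call shutdown() or never call it.
        atexit.register(self.shutdown)

    def shutdown(self):
        # Idempotent: safe if the container calls it at runtime,
        # atexit calls it at interpreter exit, or both happen.
        if self.closed:
            return
        self.closed = True
        # ... release files, connections, etc. here ...


service = Service()
service.shutdown()  # a container *may* do this at runtime
service.shutdown()  # a later call (e.g. from atexit) is a no-op
```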

By the way, in reference to shutdown being blocked by threads, AFAIK your 
statement only applies to use of the 'threading' module with "non-daemonic" 
threads.  (And that blocking is done with an atexit function.)
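
For illustration, here is one way an application could keep a background 
thread from holding up exit while still stopping it cleanly.  This is a 
sketch under assumptions: the names 'worker', 'stop_requested' and 
'stop_worker' are invented, and the thread is marked daemonic so that 
interpreter exit does not wait on it before the atexit handler gets to run:

```python
import atexit
import threading

stop_requested = threading.Event()


def worker():
    # Wake up periodically and check whether we were asked to stop.
    while not stop_requested.wait(0.01):
        pass  # periodic background work would go here


# daemon=True: interpreter exit will not wait for this thread.
background = threading.Thread(target=worker, daemon=True)
background.start()


def stop_worker():
    # Ask the thread to finish, then give it a moment to do so.
    stop_requested.set()
    background.join(timeout=1.0)


# The application, not the container, arranges its own cleanup.
atexit.register(stop_worker)
```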

>I don't care about blurring of responsibilities nearly as much as 
>utility.  You want to convert other frameworks, well then you have to 
>convert their functionality.  Spawned threads in the application 
>exist.  Resources (threads included) that have to be explicitly cleaned up 
>on shutdown exist.

It might be helpful to read some of Guido and Tim Peters' comments about 
these things on Python-Dev.  They've tended to be very much of the opinion 
that Python doesn't guarantee resource finalization, period, and that it's 
the OS's job to reclaim resources on process termination.  I found some of 
the recent discussion of the Python 2.3.2 finalization GC bugs to be quite 
enlightening on just how *hard* it is to guarantee finalization of anything.

Anyway, as I said, if an app creates a non-daemonic thread, presumably it 
*wants* shutdown to wait for it, and if it wants to know that shutdown 
is happening, there's atexit.

IOW, there are perfectly good mechanisms in the stdlib for dealing with 
these things, and I don't see a reason to either reinvent them, or force 
container authors to do the application authors' job.

>I'm not trying to be difficult, I'm just trying to envision how I would 
>adapt Webware's gateway and application (AppServer and Application) to 
>this interface.  I don't think of Webware as being particularly 
>featureful, so I'm surprised other people haven't seen these problems 
>either, unless there are solutions that I'm missing.

Well, I'm so far having trouble understanding the specific things you're 
trying to do, that aren't addressed by stdlib features.  I am still open to 
addressing whatever they might be, I just want concrete use cases and 
narrow solutions.  IOW, I'm YAGNI on widening the interface, and saying 
"show me the use cases".

>The CGI protocol already passes through gateway information, in 
>SERVER_SOFTWARE.  The client passes through information in User-Agent.
>User-Agent is already used heavily, and Webware does use SERVER_SOFTWARE 
>for a couple of things (when IIS acts differently from Apache).  It might 
>not be clean, but it gets the job done.  I know we use os.name a 
>lot.  This is information applications need, and it is against convention 
>to hide that information.

Those pieces of information have established standards and conventions to 
guide their use.  However, even those very same items are subject to 
rampant abuse, such as by sites that refuse to let you use them unless you 
pretend to be MSIE.  Thus, in the absence of a specific use case for having 
the information, I'd like to avoid its presence.  In the presence of a 
specific use case, I'll want to find the change to the spec that makes the 
least possible increase in container-to-app guarantees.

>>>Truisms, I say!  Anyway, it's not about guessing.  It's about 
>>>hard-coding behavior based on the environment, when it's called for to 
>>>solve demonstrable problems.  You don't get OS-independent programs by 
>>>hiding the operating system from the language (though people have 
>>>tried).  And I don't think you get gateway-independent applications by 
>>>hiding the gateway.
>>
>>That's what configuration is for.  The deployer/integrator should be 
>>allowed to control the app's behavior.
>
>The best documentation is when no documentation is needed.  That's my 
>truism ;)  When we figure something out, I'd rather put that knowledge 
>into code, instead of documentation.  And every piece of configuration 
>requires documentation (and it doesn't even save you any code).

Clearly, we disagree on this issue.  You want a wide interface, I want a 
narrow one.  One reason is that I want to encourage proliferation of 
containers.  We already have a huge proliferation of apps and frameworks, 
and very few choices for how to run and deploy them.  My practical 
observation has been that when identification of a host environment is 
permitted -- as opposed to introspection of host *properties* -- it rapidly 
leads to nonportable code, where portable is defined as "will run correctly 
in a *new* environment without reprogramming".

I have no objection to defining properties like "container is LRSP-multi" 
or whatever, if there is a meaningful use case.  What I object to is simply 
throwing random spoor for the servlet to sniff at and guess its prey.
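
To illustrate the difference, here is a rough sketch.  The key name 
'container.multithread' is purely hypothetical (no such property has been 
defined), and both function names are invented for the example:

```python
def needs_locking(environ):
    # Property introspection: ask about the one thing we care about.
    # 'container.multithread' is a hypothetical key, not part of any
    # agreed interface.  Default to the safe answer when it is absent.
    return bool(environ.get("container.multithread", True))


def guess_needs_locking(environ):
    # Identification: sniff the server name and guess.  A brand-new
    # container is unknown to this check, so the guess may be wrong.
    return "Apache" in environ.get("SERVER_SOFTWARE", "")
```

The first function keeps working in a container it has never heard of; the 
second needs reprogramming every time a new one appears.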

So please, let's focus on what specific properties you'd like to know about 
a container, if any.

>Now I'm confused.  If it's a single process, and handles only one request, 
>isn't that just broken?  I don't know of any example of such a server, 
>since it wouldn't be able to handle concurrent requests.

Do you need concurrent requests for a single-user "webtop" application that 
runs on your desktop?  The fact that the model has a limited usage profile 
doesn't make it broken.  Perhaps somebody will also speak up in favor of the 
"multiprocess+multithread" model that I previously mentioned as making my 
head hurt.

>>LRSP-single means only one thread.  LRSP-multi means multi-threaded.
>>Asynchronousness actually implies LRSP-multi, because if you're doing an 
>>asynchronous event loop the only way you can afford to call a blocking 
>>'runCGI()' is to do it in a thread.  Twisted and ZServer are asynchronous 
>>LRSP-multi.
>
>Okay.  This seems to mean that Twisted wouldn't use this interface 
>internally, since they don't want to unnecessarily spawn a thread, and the 
>interface doesn't seem to allow for a non-blocking API.

Twisted has a builtin "thread pool" mechanism for this.  A WSGI 
implementation for Twisted would simply call (IIRC):

     reactor.callInThread(service.runCGI, stdinWrapper, stdoutWrapper,
                          stderrWrapper, environ)

And threads would only be spawned up to the configured pool size.  If there 
are more concurrent requests than allocated threads, the runCGI call will 
be queued until an existing thread finishes.
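
The queueing behavior can be seen without Twisted at all.  The sketch below 
uses the standard library's ThreadPoolExecutor as a stand-in for the 
reactor's thread pool (it is not Twisted's implementation), and 'run_cgi' is 
a made-up placeholder for a blocking 'runCGI()' call.  With a pool of two 
threads and five requests, no more than two calls are in progress at once:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 2  # the "configured pool size"

lock = threading.Lock()
running = 0  # calls currently in progress
peak = 0     # highest number of simultaneous calls observed


def run_cgi(request_id):
    # Stand-in for a blocking service.runCGI(...) call.
    global running, peak
    with lock:
        running += 1
        peak = max(peak, running)
    time.sleep(0.05)  # pretend to do blocking work
    with lock:
        running -= 1
    return request_id


# Five "requests" arrive; only POOL_SIZE run at a time, the rest queue.
with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    results = list(pool.map(run_cgi, range(5)))
```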

ZServer has a similar mechanism, although I believe it's more "internal" 
and less available for a third party to use.  Twisted is flexible enough 
that a third party could roll their own Twisted-based WSGI gateway, if the 
core developers aren't interested or want no part of it.  :)

>But BaseHTTPServer without threads is kind of a silly thing, right?
>You can play with it, but not really use it for anything real.

Sure you can: desktop web apps, and most especially, desktop testing and 
development of an app to be later deployed in a "real" container.  I 
specifically wrote WSGIServer for this purpose (and to serve as an example 
of how to make a simple WSGI container in a web server).
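
As a small illustration of the single-threaded case (this is not WSGIServer 
itself, just a sketch), here is a one-request-at-a-time server built on 
'http.server', the name under which BaseHTTPServer lives in present-day 
Python.  'HelloHandler' and 'fetch' are invented names, and the client runs 
in a thread only so the example can talk to itself:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello from a single-threaded server\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


# Port 0 asks the OS for any free port; the socket is listening at once.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
port = server.server_address[1]

client_result = {}


def fetch():
    # Plain client request against the local server.
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", "/")
    client_result["body"] = conn.getresponse().read()
    conn.close()


client = threading.Thread(target=fetch)
client.start()
server.handle_request()  # serve exactly one request, in this thread
client.join()
server.server_close()
```

For desktop use one would call 'serve_forever()' instead of 
'handle_request()'; requests are then handled strictly one after another.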