best way to serve wsgi with multiple processes

Graham Dumpleton Graham.Dumpleton at gmail.com
Wed Feb 11 17:55:11 EST 2009


On Feb 12, 9:19 am, Robin <robi... at gmail.com> wrote:
> On Feb 11, 7:59 pm, Graham Dumpleton <Graham.Dumple... at gmail.com>
> wrote:
>
>
>
> > On Feb 11, 8:50 pm, Robin <robi... at gmail.com> wrote:
>
> > > Hi,
>
> > > I am building some computational web services using soaplib. This
> > > creates a WSGI application.
>
> > > However, since some of these services are computationally intensive,
> > > and may be long running, I was looking for a way to use multiple
> > > processes. I thought about using multiprocessing.Process manually in
> > > the service, but I was a bit worried about how that might interact
> > > with a threaded server (I was hoping the thread serving that request
> > > could just wait until the child is finished). Also it would be good to
> > > keep the services as simple as possible so it's easier for people to
> > > write them.
>
> > > I have at the moment the following WSGI structure:
> > > TransLogger(URLMap(URLParser(soaplib objects)))
> > > although presumably, due to the beauty of WSGI, this shouldn't matter.
>
> > > As I've found with all web-related Python stuff, I'm overwhelmed by
> > > the choice and number of alternatives. I've so far been using cherrypy
> > > and ajp-wsgi for my testing, but am aware of Spawning, twisted etc.
> > > What would be the simplest [quickest to setup and fewest details of
> > > the server required - ideally with a simple example] and most reliable
> > > [this will eventually be 'in production' as part of a large scientific
> > > project] way to host this sort of WSGI with a process-per-request
> > > style?
>
> > In this sort of situation one wouldn't normally do the work in the
> > main web server, but instead have a separate long-running daemon
> > process embedding a mini web server that understands XML-RPC. The
> > main web server would then make XML-RPC requests against the backend
> > daemon process, which would use threading and/or queueing to handle
> > the requests.
>
> > If the work is indeed long running, the backend process would normally
> > just acknowledge the request and not wait. The web page would return,
> > and it would be up to the user to then somehow occasionally poll the
> > web server, manually or by AJAX, to see how progress is going. That
> > is, further XML-RPC requests from the main server to the backend
> > daemon process asking about progress.
>
> > I don't believe the suggestions about fastcgi/scgi/ajp/flup or
> > mod_wsgi are really appropriate, as you don't want this done in web
> > server processes: you are then at the mercy of web server processes
> > being killed or dying partway through something. Some of these
> > systems will do that if requests take too long. Thus it is better to
> > offload the real work to another process.
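The backend-daemon pattern described above can be sketched with the standard library's SimpleXMLRPCServer. Everything here is illustrative, not from the thread: the `JobManager` helper, the `submit`/`status` endpoint names, and the port are all placeholders.

```python
import threading
import uuid
from xmlrpc.server import SimpleXMLRPCServer


class JobManager:
    """Tracks long-running jobs executed on background threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._status = {}

    def submit(self, func, *args):
        job_id = uuid.uuid4().hex
        with self._lock:
            self._status[job_id] = "running"

        def run():
            result = func(*args)
            with self._lock:
                # a list, since XML-RPC marshals tuples as lists anyway
                self._status[job_id] = ["done", result]

        threading.Thread(target=run, daemon=True).start()
        return job_id  # acknowledge immediately; the caller polls later

    def status(self, job_id):
        with self._lock:
            return self._status.get(job_id, "unknown")


manager = JobManager()


def compute(n):
    # stand-in for the real long-running computation
    return sum(i * i for i in range(n))


def serve(port=8001):
    # run the backend daemon; blocks forever serving XML-RPC requests
    server = SimpleXMLRPCServer(("localhost", port), allow_none=True)
    server.register_function(lambda n: manager.submit(compute, n), "submit")
    server.register_function(manager.status, "status")
    server.serve_forever()
```

The front-end web server would call `submit` once, return the job id to the browser, and answer later page loads (or AJAX polls) by calling `status` with that id.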
>
> Thanks - in this case I am constrained to use SOAP (I am providing SOAP
> services using soaplib so they run as a WSGI app). I chose soaplib
> because it seems the simplest way to get SOAP services running in
> Python (I was hoping to get this set up quickly).
> So I am not really able to get into anything as complex as you
> suggest... I have my nice easy WSGI app SOAP service; I would just
> like it to run in a process pool to avoid the GIL.

You can still use SOAP; you don't have to use XML-RPC. They are, after
all, both just interprocess communication mechanisms.

> Turns out I can do that
> with apache+mod_wsgi and daemon mode, or flup forked server (I would
> probably use ajp - so flup is in a separate process to apache and
> listens on some local port, and apache proxies to that using the ajp
> protocol). I'm not sure which one is best... for now I'm continuing to
> just develop on cherrypy on my own machine.

In mod_wsgi daemon mode the application is still in a distinct
process. The only difference is that Apache acts as the process
supervisor, so you do not have to install a separate system such as
supervisord or monit to start the process and ensure it is restarted
if it crashes; Apache/mod_wsgi will do that for you. You also don't
need flup when using mod_wsgi, as it provides everything itself.
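For reference, daemon mode takes only a few directives in the Apache
configuration. The group name, process/thread counts, mount point, and
script path below are placeholders, not values from the thread:

```apache
# Run the soaplib WSGI app in its own pool of daemon processes,
# supervised and restarted by Apache/mod_wsgi.
WSGIDaemonProcess soapservice processes=4 threads=1
WSGIProcessGroup soapservice
WSGIScriptAlias /soap /srv/www/soapservice/app.wsgi
```

With `threads=1` and several processes, each request gets a whole
process to itself, which suits CPU-bound SOAP calls.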

> I suspect I will use ajp forked flup, since that only requires
> mod_proxy and mod_proxy_ajp which I understand come with standard
> apache and the system administrators will probably be happier with.

The Apache/mod_wsgi approach actually has fewer dependencies. For it
you only need Apache+mod_wsgi. For AJP you need
Apache+flup+monit-or-supervisord. It just depends on which
dependencies you think are easier to configure and manage. :-)
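Robin's "process pool to avoid the GIL" idea can also be sketched inside a single WSGI app with the standard multiprocessing module, assuming the heavy function is picklable. All names here are illustrative; the optional `pool` parameter and the `SyncPool` stand-in are conveniences for testing, not part of the WSGI spec.

```python
from multiprocessing import Pool


def heavy_compute(n):
    # CPU-bound work; in production this runs in a worker process,
    # outside the serving process's GIL
    return sum(i * i for i in range(n))


class SyncPool:
    """Stand-in with Pool's apply() signature that runs calls inline.
    Handy under development servers or in tests."""

    def apply(self, func, args=()):
        return func(*args)


_pool = None  # created lazily so the module imports cleanly under WSGI


def get_pool():
    global _pool
    if _pool is None:
        _pool = Pool(processes=4)
    return _pool


def application(environ, start_response, pool=None):
    # Real servers call this with two arguments; `pool` is overridable
    # for tests. The request thread just blocks while a worker process
    # does the computation.
    worker = pool or get_pool()
    result = worker.apply(heavy_compute, (1000,))
    body = str(result).encode("ascii")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

Note that mod_wsgi daemon mode with `threads=1` and multiple processes achieves much the same isolation without any pool code in the application at all.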

Graham
