[Web-SIG] question about connection pool, task queue in WSGI

Sat Jul 14 06:07:01 CEST 2012

> On 13 July 2012 07:18, est <electronixtar at gmail.com> wrote:
>> Thanks for the answer. That's very helpful info.
>>
>>> Only by changing the Django code base from memory. Better off asking
>>> on the Django users list.
>>
>> Was my idea good or bad? (Make the wsgi server handle connection pools,
>> instead of the wsgi apps.)
>>
>> Last month I read about Tarek Ziadé's experiment with reusing a tcp port by
>> specifying socket FDs. It's an awesome idea and code, btw. I have a couple of
>> questions about it:
>>
>> 1. In theory, I presume it's also possible with db connections? (After a wsgi
>> hosting worker has ended, hand the db connection FD to the next wsgi worker.)

Unlikely. HTTP connections are stateless, whereas open database
connections are highly unlikely to be stateless, with the client likely
caching certain session information.
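
Since an open connection cannot simply be handed to another process,
sharing is usually done inside a single process instead, among its
threads. A minimal sketch of that idea follows. It is not how Django
works; the ConnectionPool class is hypothetical, and sqlite3 is only a
stand-in for a real database driver.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool of connections shared by the threads of one process."""

    def __init__(self, connect, size=2):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=5.0):
        # Blocks while every connection is in use; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        except Exception:
            # Discard transaction state left behind by a failed request.
            conn.rollback()
            raise
        finally:
            self._idle.put(conn)


def make_connection():
    # check_same_thread=False allows use from threads other than the creator.
    return sqlite3.connect(":memory:", check_same_thread=False)


pool = ConnectionPool(make_connection, size=2)
```

A request handler would then use "with pool.connection() as conn:" and
the connection goes back to the pool when the block exits.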

>> 2. Is the socket FD the same mechanism that nginx uses? If you upgrade the
>> nginx binary and restart nginx, the existing http connections won't break.

I would be very surprised if you could upgrade nginx, perform a
restart and preserve the HTTP listener socket. If you mean some other
socket, I don't know which one you are referring to.

As with Apache, you can likely make a configuration file change and
perform a restart, or trigger a reread of the configuration, and it
would maintain the HTTP listener socket across the configuration
restart. An upgrade, however, implies changing the binary, and I know
of no way that you could easily persist an HTTP listener socket across
to an invocation of a new web server instance using a new executable.
In Apache you certainly cannot do it. Unless nginx has some magic where
the existing nginx execs the new nginx version and somehow passes the
open socket connections through to the new process, I very much doubt
it could, as it would be rather messy to do so.
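
For what it is worth, the exec-based handover described above is
possible at the operating system level on Unix: a listening socket is
just a file descriptor, and a descriptor marked inheritable stays open
across exec. A rough sketch, assuming a hypothetical LISTEN_FD
environment variable to tell the new program which descriptor to adopt.
It says nothing about what nginx or Apache actually do.

```python
import os
import socket
import sys

ENV_VAR = "LISTEN_FD"  # hypothetical name, chosen for this sketch only


def open_listener(host="127.0.0.1", port=0):
    """Create a fresh listening socket (port 0 lets the OS pick a port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(16)
    return sock


def adopt_listener(fd):
    """Wrap an already-listening descriptor inherited from a previous process."""
    return socket.socket(fileno=fd)


def get_listener():
    """Adopt the inherited listener if one was passed, else open a new one."""
    fd = os.environ.get(ENV_VAR)
    return adopt_listener(int(fd)) if fd is not None else open_listener()


def reexec(listener, executable=sys.executable, argv=None):
    """Replace this process image while keeping the listener open (Unix only)."""
    # Descriptors are non-inheritable by default, so opt in explicitly.
    listener.set_inheritable(True)
    env = dict(os.environ)
    env[ENV_VAR] = str(listener.fileno())
    os.execve(executable, argv or [executable] + sys.argv, env)
```

Connections that arrive while the new program is starting wait in the
listen backlog, since the socket itself is never closed.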

>> 3. Is the following understanding of the wsgi model right?
>>
>> A wsgi worker process runs the wsgi app (like django), multiple requests are
>> handled by the same process, the django views process these requests and
>> return responses within the same process (possibly in a forked or threaded
>> way, or even both?). After a defined number of requests the wsgi worker
>> terminates and spawns the next wsgi worker process.

Different WSGI servers would behave differently, especially around
process control, but your understanding of the model is close enough.
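
As a rough illustration of that model, here is a single-process,
single-threaded worker built on the standard library's wsgiref. The
MAX_REQUESTS threshold is a made-up value, and in real servers it is
normally a separate master or supervisor process that starts the
replacement worker, not the exiting worker itself.

```python
import os
from wsgiref.simple_server import make_server

MAX_REQUESTS = 1000  # hypothetical recycle threshold
handled = 0  # module-level state persists across requests in this process


def application(environ, start_response):
    """WSGI app reporting which process served it and how many requests so far."""
    global handled
    handled += 1
    body = f"pid={os.getpid()} request={handled}\n".encode()
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def run_worker(host="127.0.0.1", port=8000):
    """Serve requests one at a time, then return so the process can exit."""
    with make_server(host, port, application) as httpd:
        while handled < MAX_REQUESTS:
            httpd.handle_request()
```

Every request handled by run_worker() sees the same module globals,
which is why per-process caches and connections can outlive a request.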

>> Before hacking into a task queue based on pure wsgi code, I want to make
>> sure my view of wsgi is correct. :)

Would still suggest you just use an existing solution.
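
If you do experiment anyway, the two points from my earlier reply can be
sketched as below: run code when the server calls close() on the
response iterable, and hand the work to a small fixed pool of threads
rather than starting a thread per request. The names AfterResponse and
example.after_response are made up for this sketch. Jobs live only in
memory, so they are lost if the process crashes or is shut down, and a
server may call close() before the client has received everything.

```python
from concurrent.futures import ThreadPoolExecutor

# A small fixed pool caps how many background tasks run at once.
_tasks = ThreadPoolExecutor(max_workers=2)


class _RunOnClose:
    """Wrap a WSGI response iterable; run a callback when the server closes it."""

    def __init__(self, iterable, callback):
        self._iterable = iterable
        self._callback = callback

    def __iter__(self):
        return iter(self._iterable)

    def close(self):
        try:
            if hasattr(self._iterable, "close"):
                self._iterable.close()
        finally:
            self._callback()


class AfterResponse:
    """Middleware that submits tasks queued during a request once it completes."""

    def __init__(self, app):
        self._app = app

    def __call__(self, environ, start_response):
        pending = []
        # Hypothetical environ key: the app calls it with a zero-argument callable.
        environ["example.after_response"] = pending.append

        def flush():
            for task in pending:
                _tasks.submit(task)  # returns at once; the pool runs the task

        return _RunOnClose(self._app(environ, start_response), flush)
```

Because close() only submits to the pool, the request thread is free
again straight away; the cost is that queued jobs are not persistent.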

Graham

>> Please advise! Thanks in advance!
>>
>>
>> On Fri, Jul 13, 2012 at 11:31 AM, Graham Dumpleton
>> <graham.dumpleton at gmail.com> wrote:
>>>
>>> On 12 July 2012 19:50, est <electronixtar at gmail.com> wrote:
>>> > Hi list,
>>> >
>>> > I am running a site with django + uwsgi, and I have a few questions
>>> > about how WSGI works.
>>> >
>>> > 1. Is db connection open/close handled by Django? If it's opened/closed
>>> > per request,
>>>
>>> Yes it is.
>>>
>>> > can we make a connection pool at the wsgi level, so that multiple django
>>> > views can share it?
>>>
>>> Only by changing the Django code base from memory. Better off asking
>>> on the Django users list.
>>>
>>> > 2. As a general design consideration, can we execute some task *after*
>>> > the response has been returned to the client? I have some heavy data
>>> > processing that needs to be done after return HttpResponse() in django.
>>> > The standard way to do this seems to be Celery or another task queue
>>> > with a broker. It's just too heavyweight. Is it possible to do some
>>> > simple background task in WSGI directly?
>>>
>>> Read:
>>>
>>> http://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode
>>>
>>> In doing this though, it ties up the request thread and so it would
>>> not be able to handle other requests until your task has finished.
>>>
>>> Creating background threads at the end of a request is not a good idea
>>> unless you do it using a pooling mechanism such that you limit the
>>> number of worker threads for your tasks. Because the process can crash
>>> or be shut down, you can lose the job, as it is held only in memory and
>>> thus is not persistent.
>>>
>>> Better to use Celery, or if you think that is too heavyweight, have a
>>> look at Redis Queue (RQ) instead.
>>>
>>> Graham
>>
>>