Select weirdness

Mon Apr 23 03:33:22 EDT 2007

In article <462c54cb$0$336$e4fe514c at news.xs4all.nl>,
 Irmen de Jong <irmen.NOSPAM at xs4all.nl> wrote:

> Ron Garret wrote:
> > I don't understand why socketserver calling select should matter.  (And 
> > BTW, there are no calls to select in SocketServer.py.  I'm using 
> > Python2.5.)
> 
> You don't *need* a select at all.

Yes I do, because what I'm really writing is a dispatching proxy that 
has to serve many simultaneous connections.

Here's the full story in case you're interested: We have an application 
that is currently fielded as a cgi.  We have a development server that 
needs to run multiple copies of the application at the same time.  This 
is so that developers can push changes into their private "sandboxes" 
for evaluation before going into the main development branch.  Each 
sandbox has its own source tree, its own database, and its own URL 
namespace.

There are a small number of URLs in the application that are performance 
bottlenecks (they are used to serve AJAX updates).  In order to 
alleviate that bottleneck without having to rewrite the whole 
application to run under mod_python or some such thing we've written a 
special dedicated server that handles only the AJAX requests.

The tricky part is that each developer needs to have their own copy of 
this server running because each developer can have different code that 
needs to run to serve those requests.  Assigning each developer a 
dedicated IP port would be a configuration nightmare, so these servers 
run on unix sockets rather than TCP sockets.

I have not been able to find a proxy server that can proxy to unix 
sockets, so I need to write my own.  Conceptually it's a very simple 
thing: read the first line of an HTTP request, parse it with a regexp to 
extract the sandbox name, connect to the appropriate unix server socket, 
and then bidirectionally pipe bytes back and forth.  But it has to do 
this for multiple connections simultaneously, which is why I need select.
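
A minimal sketch of that idea, in present-day Python 3 rather than the 
2.5 mentioned above.  The URL layout (/<sandbox>/...), the socket 
naming scheme in socket_path, and the helper names are all assumptions 
made for illustration; none of them come from the real application.  
It only shows the two pieces described: pulling the sandbox name out of 
the request line, and one select pass that copies bytes between paired 
sockets.

```python
import re
import select
import socket

# Assumed URL layout: "<METHOD> /<sandbox>/..." (hypothetical).
REQUEST_LINE = re.compile(rb"^[A-Z]+ /([A-Za-z0-9_-]+)/")

def sandbox_name(request_line):
    """Return the sandbox name from an HTTP request line, or None."""
    m = REQUEST_LINE.match(request_line)
    return m.group(1).decode("ascii") if m else None

def socket_path(name, base="/tmp/sandboxes"):
    # Hypothetical naming scheme for the per-developer unix sockets.
    return "%s/%s.sock" % (base, name)

def pump(pairs, timeout=None):
    """One select pass over all proxied connections.

    `pairs` maps every socket to its peer, in both directions.  Bytes
    that are ready on one side are copied to the other; when a socket
    reaches EOF, it and its peer are closed and dropped from `pairs`.
    """
    readable, _, _ = select.select(list(pairs), [], [], timeout)
    for sock in readable:
        peer = pairs.get(sock)
        if peer is None:          # pair already closed in this pass
            continue
        data = sock.recv(4096)
        if data:
            # sendall can block on a slow peer; a fuller version would
            # buffer the data and also select for writability.
            peer.sendall(data)
        else:
            for s in (sock, peer):
                del pairs[s]
                s.close()
```

The accept loop would read the first line from a new client, call 
sandbox_name, connect a socket.AF_UNIX socket to socket_path(name), 
forward the line it already consumed, and then add both directions to 
pairs so that repeated calls to pump handle every connection at once.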

> >> Anyway, try the following instead:
> >>
> > 
> > That won't work for POST requests.
> >
> 
> Why not?

Because POST requests can be very complicated.

> Just add some more code to deal with the POST request body.

I was really hoping to avoid having to write a fully HTTP-aware proxy.

> There should be a content-length header to tell you how many
> bytes to read after the header section has finished.

Not if the transfer-encoding is chunked.  Or if there are 
multiple file attachments.
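
To illustrate the point, here is a rough sketch of the decision an 
HTTP-aware proxy would have to make before it could read a request 
body.  The function name body_framing and its return values are made 
up for this example.  It only classifies the framing; actually decoding 
a chunked body is a further step that is not shown.

```python
def body_framing(header_block):
    """Classify how the body following `header_block` is delimited.

    `header_block` is the raw request line plus headers, as bytes.
    Returns ("chunked", None), ("length", n) or ("none", None).
    """
    headers = {}
    for line in header_block.split(b"\r\n")[1:]:   # skip the request line
        if not line:                               # blank line ends headers
            break
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip().lower()
    # In HTTP/1.1 a chunked Transfer-Encoding overrides Content-Length.
    if b"chunked" in headers.get(b"transfer-encoding", b""):
        return ("chunked", None)
    if b"content-length" in headers:
        return ("length", int(headers[b"content-length"]))
    return ("none", None)
```

Only the "length" case can be handled by counting bytes after the 
header section.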

Also, even GET requests can become very complicated (from a protocol 
point of view) in HTTP 1.1.

rg