Hello,
 
I'm not sure if this is the right list, or maybe twisted-web is more appropriate.
 
Also, I might ask a question which has already been answered many times, but I was unable to find references.
 
Short version:
I have 10,000 to 100,000 web browsers connected to my site, and I need to inform them in __real time__ (a maximum delay of 3-5 seconds) of an event that happened on the server. Is Twisted the right way to go, given that it promises asynchronous event handling?
 
Long version:
I have an information flux (a live data feed) on a web page that must change, as stated above, whenever a specific event happens on the server.
I have thought of two ways of doing this:
1. The "ask every 5 seconds" approach
Pretty obvious: the browser connects every 5 seconds and requests the page again. However, with 10,000 clients the server soon dies, and the 5-second limit is still not met (because response times get incredibly long when Apache is swamped with requests).
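
To put a rough number on that load: if every client polls once per interval, the average request rate is simply clients divided by interval. The small Python sketch below works this out for the client counts mentioned above; the function name is made up for illustration, and it ignores bursts, retries and any other traffic.

```python
def polling_request_rate(clients: int, interval_seconds: float) -> float:
    """Average requests per second when every client polls once per interval."""
    return clients / interval_seconds


# Client counts taken from the short version above; 5 s is the polling interval.
for clients in (10_000, 100_000):
    rate = polling_request_rate(clients, 5)
    print(f"{clients} clients polling every 5 s -> {rate:.0f} requests/s")
```

So even the low end means about 2,000 requests per second on average, almost all of which return nothing new.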
 
2. The "ask and wait for answer" approach
The basic idea is the following:
- the browser connects to the web page
- a JavaScript snippet in the page reconnects in the background (using the JavaScript XMLHttpRequest object) to a special script on the server.
- the server keeps the connection open (by literally sending a space once every 10-15 seconds and sleeping in between, so as not to put too much load on the server). When an event happens, the server sends all the needed data to the client, which redisplays it (through JavaScript).
Of course, there is the problem of Apache and its 5-minute script running limit (I have implemented this in PHP), but the JavaScript code is smart enough to handle this: when a connection fails, it reconnects and all goes well.
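
For illustration, the server side of this approach could look roughly like the sketch below in an event-driven style. It is not the PHP implementation and not Twisted code: it uses Python's standard asyncio module as a stand-in for an asynchronous framework, and `long_poll`, `get_data` and `send` are hypothetical names (in a real server, `send` would write to the client's open HTTP response). The point is that waiting between keep-alive spaces does not block a process or thread.

```python
import asyncio


async def long_poll(event, get_data, send, keepalive=10.0):
    """Hold one client connection open until `event` fires.

    `send` is a hypothetical callable that writes a chunk to the client;
    `get_data` returns the data to deliver once the event has happened.
    """
    while True:
        try:
            # Wait for the event; other connections keep being served meanwhile.
            await asyncio.wait_for(event.wait(), timeout=keepalive)
        except asyncio.TimeoutError:
            send(" ")  # keep-alive space so the connection stays open
            continue
        send(get_data())
        return


async def demo():
    sent = []
    event = asyncio.Event()
    # Short keep-alive interval only so that the demo finishes quickly.
    task = asyncio.create_task(
        long_poll(event, lambda: "new data", sent.append, keepalive=0.05)
    )
    await asyncio.sleep(0.12)  # long enough for at least one keep-alive space
    event.set()                # the server-side event happens
    await task
    return sent


chunks = asyncio.run(demo())
print(chunks)  # keep-alive spaces followed by the data
```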
 
This was a little better than the first approach, at least in the response times, which are now consistent with the requirements. However, a new problem arises: Apache cannot handle a very large number of open connections at the same time (in this case every web browser has at least one open connection). By my calculations (it's pretty hard to compute exactly, as I know of no JavaScript-enabled crawler that I can drive programmatically), the server will be completely trashed at around 300 connections.
 
The problem gets even more complicated with today's browsers: they have a limit of 2 concurrent connections to the same site (I don't know if you noticed, but you cannot download 3 files concurrently from the same site). And the XMLHttpRequest connections count toward this limit. So if a client uses two of my information fluxes, they will be unable to browse the site at the same time.
I don't know if Twisted solves this last problem. If not, I'll try to find a workaround (messing around with the DNS seems like a good idea at this point).
 
The question is whether Twisted can solve my problem of informing all my clients of the event (the event will not happen concurrently for all the clients, so there is no problem with server load; however, all the clients will be listening concurrently, each for their specific event).
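
To make the "many clients listening, each for their own event" part concrete, here is a minimal sketch of the bookkeeping involved. It again uses Python's standard asyncio rather than Twisted (in Twisted, Deferreds would play the role of the futures here), and `EventHub`, `subscribe` and `fire` are made-up names. Each waiting client costs one pending future, and firing an event touches only the clients registered for that event.

```python
import asyncio
from collections import defaultdict


class EventHub:
    """Maps an event key to the clients currently waiting for it."""

    def __init__(self):
        self._waiters = defaultdict(list)

    def subscribe(self, key):
        """Register one waiting client; the returned future resolves on fire()."""
        fut = asyncio.get_running_loop().create_future()
        self._waiters[key].append(fut)
        return fut

    def fire(self, key, data):
        """Deliver `data` to every client waiting on `key`; return how many."""
        waiters = self._waiters.pop(key, [])
        for fut in waiters:
            if not fut.done():
                fut.set_result(data)
        return len(waiters)


async def demo(n_clients=10_000, n_keys=100):
    hub = EventHub()
    # 10,000 simulated clients, spread evenly over 100 distinct events.
    futures = [hub.subscribe(f"event-{i % n_keys}") for i in range(n_clients)]
    notified = hub.fire("event-7", "new data")
    done = sum(f.done() for f in futures)
    for f in futures:
        f.cancel()  # release the remaining waiters before the loop closes
    return notified, done


notified, done = asyncio.run(demo())
print(notified, done)  # 100 100
```

This only models the waiting and notification logic in one process; it says nothing about how many open sockets a given machine or operating system can actually sustain.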
 
Thanks for any answer, or for any direction/pointers you can give me. I might be totally wrong in my approach, so I'm really open to all suggestions (except buying lots of servers to make this work, of course).
 
Tiberiu DONDERA