Response hanging after headers sent
Hello, I've been trying to implement a simple notification system via long polling using Twisted 14.0.0. On the client side, a web page makes Ajax calls to an "update" resource, which will deliver the latest status message known to the application. On the server side, a thread feeds a queue with status events and another one polls this queue and sends the latest value to any pending request, trying to keep the request open for 30 seconds, or until a new status event arrives, whichever happens sooner (this is the long polling). Occasionally, one of responses will just hang. "request.finish()" is called properly (I've even checked this with "notifyFinish()"), the response headers are sent, but the response body isn't (according to Wireshark), thereby leaving the browser hanging (since it thinks a valid response has started to come through). I have noticed this behaviour with Twisted 13 and 14, but not with earlier versions. This is also an intermittent problem. I have been able to reproduce it a number of times, but not always, and it doesn't necessarily happen after the same number of requests. It seems to fail more easily on machines with one CPU core, or when the execution is limited to one core, with only one request at a time (for some reason, having concurrent requests from multiple browsers or tabs, or being able to use multiple CPUs on the server seems to make the problem harder to reproduce). I'm pasting at the end of this message a simple example where the status events are coming from a counter running in a separate thread. When limiting the server process to one CPU, the counter visible on the webpage rarely seems to go much above 2000. I've only tried under Linux. I've tried this on an actual PC (limiting to one core using "taskset -c 0 python test_server.py"), a VM running in VirtualBox (set to 1 CPU) and even a Raspberry Pi. The Raspberry Pi is where the problem seems to be the easiest to reproduce, but this also happens with the actual PC and virtual machine. This was also only tried on a LAN (considering that, for the purpose of this test case, the notifications are rather frequent). I've tried with the default chunked encoding, or by setting the content-length explicitly, but it didn't make a difference. Is there a way to force to the response to be flushed, after a call to "request.finish()"? Any other suggestions to fix this would also be appreciated. Thank you, Bruno. Only 2 files are required for this test case: counter.html, test_server.py. I've run this in a virtual environment, with Python 2.7 (in which I've installed Twisted with `pip install twisted==14.0.0`). On a multicore machine, to limit the execution to one core, call the server with taskset: "taskset -c 0 python test_server.py". Once the server is running, point a browser (tried with Chrome and Firefox) to "http://the.ip.address.example:8880/", you should see a counter increasing (and eventually stopping, which illustrates the problem). The issue seems to happen more often when only 1 client (or tab) is connected to the server. - counter.html: <html> <head> <script type="application/javascript" src="// code.jquery.com/jquery-1.11.0.min.js"></script> </head> <body> <div id="counter" style="font-size: 100pt; font-weight: bold;"></div> <script type="application/javascript"> // <![CDATA[ $(document).ready(function() { var eTag = null; var normalRefreshDelay = 50; var refreshDelay = normalRefreshDelay; var maxRefreshDelay = 5000; function refresh() { var headers = {}; if (eTag) { headers['If-None-Match'] = eTag; } $.ajax({ url: "/update", headers: headers, dataType: "json", cache: false, success: function(data, status, xhr) { try { if (data[1] != null) { $("#counter").text("Counter: "+data[1]); } eTag = xhr.getResponseHeader("ETag") || null; refreshDelay = normalRefreshDelay; } catch (e) { refreshDelay = Math.min(refreshDelay * 2, maxRefreshDelay); } }, error: function(xhr, text, error) { refreshDelay = Math.min(refreshDelay * 2, maxRefreshDelay); }, complete: function(xhr, textStatus) { setTimeout(refresh, refreshDelay); } }); } refresh(); }); // ]]> </script> </body> </html> - test_server.py # -*- coding: utf-8 -*- import threading import time import Queue import uuid import socket import json from twisted.web.server import Site from twisted.web.resource import Resource from twisted.web.server import NOT_DONE_YET from twisted.internet import reactor from twisted.web.static import File q = Queue.Queue() def fake_counter(): i = 1 while True: q.put({ 1: str(i) }) time.sleep(0.15) i += 1 producer_thread = threading.Thread(target=fake_counter) producer_thread.daemon = True producer_thread.start() class CounterUpdateResource(Resource): def __init__(self, msg_queue): Resource.__init__(self) self.msg_queue = msg_queue self.msg = {} self.etag = None self.pending_requests = [] t = threading.Thread(target=self._poll_messages) t.daemon = True t.start() def _poll_messages(self): while True: try: self.msg = self.msg_queue.get(True, 30) self.etag = '"%s"' % (uuid.uuid4(),) print "-> Got new message: %s" % (self.msg,) except Queue.Empty: pass json_str = json.dumps(self.msg) json_len = len(json_str) while True: try: request = self.pending_requests.pop() self._actual_render(request, json_str, json_len) except IndexError: break def _actual_render(self, request, json_msg, json_len): try: request.setETag(self.etag) request.setHeader("Content-Type", "application/json") request.setHeader("Content-Length", json_len) request.write(json_msg) request.finish() except: pass def render_GET(self, request): ifNoneMatchHeader = request.getHeader("If-None-Match") if ifNoneMatchHeader is None or ifNoneMatchHeader != self.etag: if self.etag: request.setETag(self.etag) request.setHeader("Content-Type", "application/json") data = json.dumps(self.msg) request.setHeader("Content-Length", len(data)) return data else: self.pending_requests.append(request) return NOT_DONE_YET root = Resource() root.putChild("", File("counter.html")) root.putChild("update", CounterUpdateResource(q)) factory = Site(root) reactor.listenTCP(8880, factory) reactor.run() - Sample Wireshark output: Here is a sample output using Wireshark: GET /update?_=1410440411047 HTTP/1.1 Host: 192.168.0.8:8880 Connection: keep-alive Cache-Control: max-age=0 Accept: application/json, text/javascript, */*; q=0.01 X-Requested-With: XMLHttpRequest User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.103 Safari/537.36 If-None-Match: "84c25895-de1e-4b50-954b-438be9d8d9b7" Referer: http://192.168.0.8:8880/ Accept-Encoding: gzip,deflate,sdch Accept-Language: en-GB,en;q=0.8,en-US;q=0.6 HTTP/1.1 200 OK Date: Thu, 11 Sep 2014 13:06:23 GMT Content-Length: 12 ETag: "7736231b-ee3a-4a9f-a93e-ced0994df098" Content-Type: application/json Server: TwistedWeb/14.0.0 {"1": "371"} GET /update?_=1410440411048 HTTP/1.1 Host: 192.168.0.8:8880 Connection: keep-alive Cache-Control: max-age=0 Accept: application/json, text/javascript, */*; q=0.01 X-Requested-With: XMLHttpRequest User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.103 Safari/537.36 If-None-Match: "7736231b-ee3a-4a9f-a93e-ced0994df098" Referer: http://192.168.0.8:8880/ Accept-Encoding: gzip,deflate,sdch Accept-Language: en-GB,en;q=0.8,en-US;q=0.6 HTTP/1.1 200 OK Date: Thu, 11 Sep 2014 13:06:23 GMT Content-Length: 12 ETag: "ddd10400-3530-4cb4-a9cd-0b123032d8af" Content-Type: application/json Server: TwistedWeb/14.0.0 (I've inserted a coupe of line returns after "{"1": "371"}" for readability.) Here, nothing is sent on the wire after the headers of the last response (the "\r\n\r\n" are present).
On 02:27 pm, bruno@distributedmatter.net wrote:
Hello,
I've been trying to implement a simple notification system via long polling using Twisted 14.0.0. On the client side, a web page makes Ajax calls to an "update" resource, which will deliver the latest status message known to the application. On the server side, a thread feeds a queue with status events and another one polls this queue and sends the latest value to any pending request, trying to keep the request open for 30 seconds, or until a new status event arrives, whichever happens sooner (this is the long polling).
[snip]
class CounterUpdateResource(Resource): def __init__(self, msg_queue): Resource.__init__(self) self.msg_queue = msg_queue self.msg = {} self.etag = None self.pending_requests = []
t = threading.Thread(target=self._poll_messages) t.daemon = True t.start()
def _poll_messages(self): while True: try: self.msg = self.msg_queue.get(True, 30) self.etag = '"%s"' % (uuid.uuid4(),) print "-> Got new message: %s" % (self.msg,) except Queue.Empty: pass json_str = json.dumps(self.msg) json_len = len(json_str) while True: try: request = self.pending_requests.pop() self._actual_render(request, json_str, json_len) except IndexError: break
def _actual_render(self, request, json_msg, json_len): try: request.setETag(self.etag) request.setHeader("Content-Type", "application/json") request.setHeader("Content-Length", json_len) request.write(json_msg) request.finish() except: pass
`_poll_messages` is run in a thread. It calls `_actual_render` - so `_actual_render` is run in a thread. It calls these `request` methods - so they are run in a thread. Twisted APIs are not thread-safe - except for one or two that are explicitly marked as being thread-safe. Since this code calls Twisted APIs (the request methods) in a thread, they have undefined behavior. If you change this so that `_actual_render` (or all of the request methods) are called in the reactor thread then I expect it will start to work reliably. Stepping back, you might consider getting rid of the multithreaded in- process polling and instead make your application fully event-driven. Instead of having one part of your program put messages into a queue and another part look at the queue from time to time and do work if it finds an item, have one part of your program call a method implemented by another part of your program which does the work. No threads, no polling, hooray. Stepping back even further, there are existing libraries for doing this kind of thing with Twisted already. There's Divmod Nevow's Athena and there's Minerva. There are also other approaches to "real time" web - such as websockets (see Autobahn and txwebsocket). You may not actually have to implement this low-level plumbing yourself. Jean-Paul
participants (2)
-
Bruno Harbulot
-
exarkun@twistedmatrix.com