[Twisted-Python] data dispatch on massive connection counts

I am testing a WebSocket based pubsub system and I have 2 questions .. any hints welcome!

Environment:

+ FreeBSD 8.2 i386
+ Python 2.7.2 32-bit
+ Twisted trunk
+ new kqueue() reactor

The FreeBSD is tuned for massive connection numbers. I can connect 50k WS clients. Fine.

1) Massive dispatching

Essentially, what I'm currently doing is:

=========
recvs = set(...)  # 100k instances of protocol.Protocol
data = "..."
for recv in recvs:
    recv.transport.write(data)
=========

Now, writing mass data to _one_ transport should be done using producer/consumer. But in my case the data itself is tiny (<100 octets) and the same for all clients; it's the number of transports to dispatch that data to that is massive.

The problem is: while the above loop is running, other stuff is being delayed. What would be the right approach to solve that?

One idea is to split the loop into 1k chunks and use reactor.callLater to have the sending called again until all recvs are served (a sketch follows at the end of this mail). Reentering the reactor via reactor.callLater should make other stuff run in between, right? Can I use callLater(0, ..), i.e. no delay at all? And is this approach recommended anyway?

2) Too many files

As said, the FreeBSD is tuned for massive connections and I can connect 50k clients. However, the Twisted application not only contains the WebSocket stuff, but also a Twisted Web based web server. Somewhere above 30k connections, I'm beginning to see:

twisted/web/server.py, line 132 in process
...
twisted/python/filepath.py, line 643 in open
<type 'exceptions.IOError'>: [Errno 24] Too many open files

However:

[autobahn@autobahnhub ~/AutobahnHub/service]$ sysctl kern.maxfiles
kern.maxfiles: 204800
[autobahn@autobahnhub ~/AutobahnHub/service]$ sysctl kern.maxfilesperproc
kern.maxfilesperproc: 200000
[autobahn@autobahnhub ~/AutobahnHub/service]$ sysctl kern.ipc.maxsockets
kern.ipc.maxsockets: 204800
[autobahn@autobahnhub ~/AutobahnHub/service]$ ulimit
unlimited

It should go well beyond 50k. Doing an lsof on the app PID, I can see the 50k connected TCPs and <100 open files. Why is it denying opening more files? Is there another limit specifically for files, and/or something tunable in Python/Twisted?
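PS: for 1), the chunked dispatch I have in mind would look roughly like this (untested sketch; dispatch is a made-up helper):

=========
from twisted.internet import reactor

def dispatch(recvs, data, chunksize=1000):
    # recvs is a set(); note this consumes it via pop(), so pass in a
    # copy of the receiver registry, not the registry itself
    cnt = 0
    while recvs and cnt < chunksize:
        recvs.pop().transport.write(data)
        cnt += 1
    if recvs:
        # yield to the reactor, then come back for the next chunk
        reactor.callLater(0, dispatch, recvs, data, chunksize)
=========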

On Mon, 2011-11-14 at 04:07 -0800, Tobias Oberstein wrote:
Besides callLater, a higher-level construct to do this is twisted.internet.task.cooperate(); see the sketch below. https://blip.tv/pycon-us-videos-2009-2010-2011/pycon-2010-cooperative-multit... is a presentation dreid did, which should really be turned into a howto.
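E.g. (untested sketch):

=========
from twisted.internet import task

def broadcast(recvs, data):
    def sender():
        for recv in recvs:
            recv.transport.write(data)
            yield None  # give the Cooperator a chance to reschedule
    # cooperate() runs the iterator in reactor-friendly time slices;
    # the returned CooperativeTask's whenDone() Deferred fires once
    # all receivers have been served
    return task.cooperate(sender())
=========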
2) Too many files. Is there another limit specifically for files, and/or something tunable in Python/Twisted?
That's definitely not a limit on Python or Twisted. That's an OS limit. Are you opening lots of files in addition to your sockets?

Interesting. Need to add that to my toolbox.

In the meantime I've implemented the chunked/callLater approach. It works. However, as I currently do it, it breaks message ordering guarantees, i.e. it's no longer guaranteed that each recipient will receive published messages in the order a _single_ publisher sent them. Currently I use a set() and pop() from it. I might change that to fetch receivers from deque(sorted(set)), roughly as sketched below.

Is it guaranteed that

callLater(0, fun1)
callLater(0, fun2)

will always result in fun2 being called after fun1? Because then, and when always using the same chunk size (# of receivers), the above ordering guarantee would hold. I'd like to avoid having to create a send queue per receiver ..
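I.e. roughly like this (untested sketch; assumes calls scheduled via callLater(0, ..) run in the order they were scheduled):

=========
from collections import deque
from twisted.internet import reactor

def dispatch_ordered(registry, data, chunksize=1000):
    # snapshot the receiver set in a stable order, so every published
    # message traverses the receivers the same way
    recvs = deque(sorted(registry))
    def send_chunk():
        for _ in xrange(min(chunksize, len(recvs))):
            recvs.popleft().transport.write(data)
        if recvs:
            reactor.callLater(0, send_chunk)
    send_chunk()
=========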
No. After e.g. 50k clients have connected:

[autobahn@autobahnhub ~/AutobahnHub/service]$ sysctl kern.openfiles
kern.openfiles: 50115
[autobahn@autobahnhub ~/AutobahnHub/service]$ lsof -p 1888 | wc -l
50075

This bugs me .. I don't know why it's happening.

Python open() will bail out at 32768 open files. This despite the fact that:

a) resource.getrlimit(resource.RLIMIT_NOFILE) = 200k
b) I can accept 50k sockets

Obviously, it's at least not Twisted related ..

=========
import resource

# getrlimit() returns a (soft, hard) tuple; in Python 2 an int always
# compares less than a tuple, so this loop only stops when open() fails
max = resource.getrlimit(resource.RLIMIT_NOFILE)
cnt = 0
print "maximum FDs", max
fds = []
while cnt < max:
    f = open("/tmp/test_%d" % cnt, "w")
    f.write("test")
    fds.append(f)
    cnt += 1
    if cnt % 1000 == 0:
        print "opened %d files" % cnt
print "ok, created %d files" % cnt
=========

maximum FDs (200000L, 200000L)
opened 1000 files
opened 2000 files
...
opened 31000 files
opened 32000 files
Traceback (most recent call last):
  File "/home/autobahn/fdtest.py", line 10, in <module>
IOError: [Errno 24] Too many open files: '/tmp/test_32765'

It's not there .. neither in 2.6 nor 2.7:

[autobahn@autobahnhub ~/AutobahnHub/service]$ python
Python 2.7.1 (r271:86832, Dec 13 2010, 15:52:15)
[GCC 4.2.1 20070719 [FreeBSD]] on freebsd8
Type "help", "copyright", "credits" or "license" for more information.

oberstet@wwwtavendo: ~ $ python
Python 2.6.6 (r266:84292, Dec 23 2010, 15:11:37)
[GCC 4.2.1 20070719 [FreeBSD]] on freebsd8
Type "help", "copyright", "credits" or "license" for more information.

That's definitely not a limit on Python or Twisted. That's an OS limit. Are you opening lots of files in addition to your sockets?
As it turns out, you are right, and the full answer is most unpleasant. This bug in FreeBSD libc, http://www.freebsd.org/cgi/query-pr.cgi?pr=148581&cat= , which is present on i386/amd64 up to and including FreeBSD 9 RC1, combined with the fact that Python uses fopen() from libc rather than open() from POSIX, means that you can't get more than 32k FDs.

In my situation it's like this: the new kqueue reactor will happily accept 50k TCPs .. no problem, since Python doesn't involve fopen() there. But as soon as a Python open() and thus an fopen() happens, the new FD would need to be >32k, and that does not work, since the braindead libc on FreeBSD stores the FILE fileno in a short, capping it at SHORT_MAX.

Now I'm running out of options. I was told that the new Python 3 I/O system does not use fopen(), however Twisted is not yet there on Python 3, right? There is a backport of that new I/O to Python 2.7, but I'm not sure if that's transparent for calls like Python open(). I can't just open the first e.g. 100 files up front either, since the set of files needed is not fixed at the beginning, e.g. when Twisted does a log file switch.

Well, this is all absolutely sad. Now we (nearly) have the new kqueue reactor, and it does fly, but I can't break above 32k anyway ..

On Mon, 14 Nov 2011 09:57:46 -0800 Tobias Oberstein <tobias.oberstein@tavendo.de> wrote:
If you are wild, you can try https://bitbucket.org/pitrou/t3k/ (at least as an experiment :-))
There is a backport of that new I/O to Python 2.7, but I'm not sure if that's transparent for calls like Python open().
io.open() should work, indeed. However, open() still uses the old I/O routines (and therefore fopen()) for compatibility. If you don't have control over the open() call, you could still try to monkeypatch the module doing the open() call:

somemodule.open = io.open

cheers
Antoine.
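I.e. something like this (sketch; somemodule is hypothetical and stands for whatever module performs the open() call):

=========
import io
import somemodule  # hypothetical: the module whose open() calls to redirect

# an unqualified open() resolves through the module's globals before
# reaching the builtin, so assigning here shadows it with io.open()
somemodule.open = io.open

# io.open() allocates file descriptors via open(2) rather than fopen(3),
# so it is not subject to the SHORT_MAX fileno cap in FreeBSD's stdio
=========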
