[Twisted-Python] I found an interesting comment about Twisted vs. Erlang
![](https://secure.gravatar.com/avatar/45c4c3d016586cd3f4f3adcc3f0c104d.jpg?s=120&d=mm&r=g)
I have been a long-time Twisted user, and I do not know Erlang. I ran into this interesting comment in the engineering notes on Facebook Chat scalability: http://www.facebook.com/note.php?note_id=51412338919&comments

Leif K-Brooks (February 18 at 3:17am): I'm curious whether you considered using the Python library Twisted (http://twistedmatrix.com/) before deciding on Erlang. Its architecture is also supposed to be able to support a large number of concurrent connections. I would be interested in hearing about any issues you discovered with it.

Lawrence Oluyede (February 18 at 4:35am): TwistedMatrix does not possess the capabilities of Erlang, not even close.

What do you guys think?
![](https://secure.gravatar.com/avatar/84b6e4ca9a6e15969649b4728224378b.jpg?s=120&d=mm&r=g)
One is a language. The other is an event loop. I'm not sure how we are supposed to compare the two. If he had said E and Twisted, perhaps it'd be a more interesting comparison :-) Also, what Steve said jumped out at me as well.

Don't get me wrong. I like Erlang -- it's functional, it's robust, it's very easy to make your programs execute in parallel. RabbitMQ and Scalaris are two examples of *excellent* Erlang software. I don't really think Erlang was a bad choice -- I just think they don't know enough about Twisted to judge :-)

Laurens
![](https://secure.gravatar.com/avatar/769326b856fc9c7ef80a425fecd05ac9.jpg?s=120&d=mm&r=g)
On Tue, Sep 29, 2009 at 3:11 PM, Laurens Van Houtven <lvh@laurensvh.be> wrote:
Actually, one is a concurrent virtual machine, language and applications framework (OTP); the other is a single-threaded event loop library. Apples and oranges.

I have written production-level code at a major ISP using Twisted. It became very painful once we started getting CPU bound on 32-core machines. We had to run multiple instances of our Twisted server implementation to saturate the machine. I have since "ported" the same application to Erlang/OTP. It is about 1/5 the amount of code. And it scales 1:1 horizontally (adding more CPUs, local or remote) and vertically (adding FASTER CPUs) with no code changes.

The broad statement that Twisted doesn't possess the capabilities of Erlang/OTP is pretty much accurate. Each solves a different problem. Granted, Erlang/OTP is definitely a superset of what Twisted does.

-- Jarrod Roberson 678.551.2852
![](https://secure.gravatar.com/avatar/45c4c3d016586cd3f4f3adcc3f0c104d.jpg?s=120&d=mm&r=g)
Scaling horizontally, i.e. parallelization of Twisted servers over multiple cores, is a difficult problem. We run this on 8- and 16-core machines, one process per core, and I use UDP multicast on the internal LAN for messaging between the servers (both inside the same physical machine and to remote nodes). This is of course less than ideal. It would be very nice if Twisted had some inherent capabilities for scaling horizontally.
> vertically (adding FASTER CPUs) with no code changes.
When you ported, were you able to serve fewer or more clients on a CPU core of the same speed? The biggest advantages of Twisted are a shorter development time and the availability of Python libraries. But I wonder whether running something at the scale of Facebook chat would be feasible at all, or whether there are people on this list who have run many parallelized Twisted servers that saturated many multi-core machines. In other words, are there practical examples of large-scale Twisted deployments?
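For readers unfamiliar with the setup described in this message (one server process per core, with UDP multicast used to relay messages between processes), here is a minimal sketch. It uses only the Python standard library rather than Twisted's own multicast support, and every name in it is an assumption made for illustration: the group address `239.1.1.1`, port `9999`, the JSON wire format, and the helper functions are hypothetical and not taken from the poster's application.

```python
import json
import socket

# Hypothetical values, chosen only for this sketch.
GROUP = "239.1.1.1"
PORT = 9999


def encode_message(node_id, group_name, text):
    """Serialize one chat message into a UDP datagram payload."""
    payload = {"node": node_id, "group": group_name, "text": text}
    return json.dumps(payload).encode("utf-8")


def decode_message(datagram, own_node_id):
    """Parse a datagram; return None for messages this process sent itself."""
    message = json.loads(datagram.decode("utf-8"))
    if message["node"] == own_node_id:
        return None
    return message


def make_sender(ttl=1):
    """Create a UDP socket for sending to the multicast group.

    A TTL of 1 keeps datagrams on the local network segment.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def make_receiver(group=GROUP, port=PORT):
    """Create a UDP socket that has joined the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow several processes on one machine to bind the same port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # ip_mreq: group address followed by the local interface (any).
    mreq = socket.inet_aton(group) + socket.inet_aton("0.0.0.0")
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock
```

Each process would call `make_sender().sendto(encode_message(...), (GROUP, PORT))` when a local client posts a message, and pass datagrams read from `make_receiver()` through `decode_message` before delivering them to its own clients. UDP gives no delivery guarantee, which may be part of why the poster calls the arrangement less than ideal.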
![](https://secure.gravatar.com/avatar/e1554622707bedd9202884900430b838.jpg?s=120&d=mm&r=g)
On Wed, Sep 30, 2009 at 4:00 PM, Alec Matusis <matusis@yahoo.com> wrote:
http://twistedmatrix.com/trac/wiki/SuccessStories#Justin.tv "Justin.tv is the largest live video site on the internet. ... Our Twisted video cluster currently supports thousands of simultaneous broadcasts and serves up to 30M live video streams per day."
![](https://secure.gravatar.com/avatar/d6304567ada7ac5e8c6f4e5902270831.jpg?s=120&d=mm&r=g)
On Tue, Sep 29, 2009 at 2:16 PM, Alec Matusis <matusis@yahoo.com> wrote:
I remember reading that comment and sighing to myself. First of all:

1. Twisted is a framework for event-driven programming
2. Erlang is a programming language / VM

So any comparison between the two doesn't make much sense. Secondly, the use of the term "TwistedMatrix" indicates pretty strongly that the poster knows next to nothing about Twisted. At surface level the comment is completely unqualified, lacking any detail. <sarcasm>It doesn't "possess the capabilities" of Erlang? So Erlang has some unspecified capabilities (they must be really important ones) that Twisted obviously does not. That's great to know. Thanks for the detail there, Mr. Oluyede.</sarcasm>

A comparison between Erlang and Python, the VMs, might be more interesting. First of all, Erlang was developed from the ground up with performance and concurrency in mind, whereas Python was not. The primary design concern of Python, arguably, was to develop a language that was easy to learn, with a syntax that lent itself to programs that are easy to read and thus easy to maintain. A fast VM is great, but do you want that at the expense of having fewer choices for third-party libraries, or having to learn an entirely different language paradigm? If you work primarily with OOP, for example, Erlang might be a dramatic shift for your development team: switching to FP. That's not much fun if programming is your livelihood more than a hobby.

Personally I prefer Python to Erlang, accepting the performance trade-off. There are more choices when it comes to open-source libraries for the language, and it performs well enough. I'm writing production applications in Python using Twisted and we have 0 issues with performance (memory, CPU, IO). Twisted does handle a large number of connections. The issues we've run into in our application have always been at a layer above Twisted (i.e. our shit that's broken, not Twisted). It performs, it's stable. So ... what are my motivations for abandoning Python/Twisted in favor of Erlang?

Let me just say, though, that I really like Erlang. I think it's a decent language with simple-enough syntax and a kick-ass concurrency paradigm: shared nothing and message passing. It looks like things are also dramatically improving in the Erlang open-source space (see http://erlware.org/erlware/index.html), so it might not be long before there are as many or more library choices for Erlang as for Python. Nonetheless, please let first-hand experience using a language, gaining productivity, and then buy-in from your development team be the key deciding factors, not some arbitrary number in a benchmark test or, even worse, a completely uninformed comment on a Facebook dev blog post.

-Drew
![](https://secure.gravatar.com/avatar/84b6e4ca9a6e15969649b4728224378b.jpg?s=120&d=mm&r=g)
I think we are devoting way too much time to the unfounded, unaccountable, random blathering of a troll :-)
![](https://secure.gravatar.com/avatar/45c4c3d016586cd3f4f3adcc3f0c104d.jpg?s=120&d=mm&r=g)
> I think we are devoting way too much time to the unfounded, unaccountable, random blathering of a troll :-)
As you can see from the %CPU column, I have my reasons for concern ;) This is a current copy and paste from a node with 2x quad-core Xeon L5420 @ 2.50GHz, 1 twistd process per core.

| PID | USER | PR | NI | VIRT | RES | SHR | S | %CPU | %MEM | TIME+ | COMMAND |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 24448 | nobody | 20 | 0 | 991m | 607m | 2452 | R | 99 | 4.3 | 24764:37 | twistd |
| 28553 | nobody | 20 | 0 | 909m | 453m | 2412 | R | 95 | 3.2 | 1346:51 | twistd |
| 24640 | nobody | 20 | 0 | 1092m | 676m | 2452 | R | 93 | 4.8 | 32750:14 | twistd |
| 29900 | nobody | 20 | 0 | 802m | 362m | 2412 | R | 93 | 2.6 | 1180:53 | twistd |
| 24279 | nobody | 20 | 0 | 761m | 277m | 2424 | R | 42 | 2.0 | 13891:58 | twistd |
| 32422 | nobody | 20 | 0 | 1381m | 962m | 2372 | R | 10 | 6.9 | 13241:54 | twistd |
| 24210 | nobody | 20 | 0 | 599m | 236m | 2396 | S | 4 | 1.7 | 9063:29 | twistd |
| 29862 | nobody | 20 | 0 | 323m | 14m | 2384 | S | 2 | 0.1 | 71:53.50 | twistd |
![](https://secure.gravatar.com/avatar/9ba6ae09ad47f1dd0dce031fa052185a.jpg?s=120&d=mm&r=g)
Hi Alec, On Tue, Sep 29, 2009 at 12:36 PM, Alec Matusis <matusis@yahoo.com> wrote:
What was your application doing at the time? Was it idle, heavily loaded, somewhere in the middle? What is the QoS for each client connecting to your service? Are requests being handled in a timely fashion? Do the users of your service perceive a performance problem or do you just not like seeing big numbers in the %CPU column? Are performance problems really in Twisted, or are they because of suboptimal decisions in application logic/factoring? Posting a list like that and suggesting that something is wrong (or right, even) with Twisted isn't really productive: there's nowhere obvious to go from here to make it better. My first glance at those numbers made me think Twisted is awesome because it lets you write an application that can actually make full use of the CPU power you have, but I'm guessing that's not what you're getting at. :) Thanks, J.
![](https://secure.gravatar.com/avatar/45c4c3d016586cd3f4f3adcc3f0c104d.jpg?s=120&d=mm&r=g)
I am actually not suggesting the performance is bad; it's quite decent. Each process is handling about 30,000 clients, using the epoll reactor, and is heavily loaded, as you can see from the S column, which shows "R". The app broadcasts the messages sent from a client to groups of other clients, and has some other logic.

I posted this twice by mistake, but as I wrote in my second post, the debugging is pretty hard, especially memory leaks... Also, performance debugging: say, how much time is spent in each function? FB engineers seem to say that debugging of Erlang is transparent, but I do not have first-hand knowledge.
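On the question of how much time is spent in each function: one possible starting point is Python's standard `cProfile` module, sketched below. The `broadcast` function is a hypothetical stand-in for application logic, and `profile_call` is a helper name invented for this example; neither comes from the poster's code, and profiling a live reactor under load would need more care than this shows.

```python
import cProfile
import io
import pstats


def broadcast(message, clients):
    """Stand-in for application logic: build one outgoing line per client."""
    return ["%s:%s" % (client, message) for client in clients]


def profile_call(func, *args, **kwargs):
    """Run func under cProfile.

    Returns (result, report), where report is a text table of the
    most expensive calls sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


if __name__ == "__main__":
    _, report = profile_call(broadcast, "hello", range(30000))
    print(report)
```

The report lists call counts and per-function times, which can help separate time spent in application code from time spent in the framework.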
participants (7)
- Alec Matusis
- Drew Smathers
- Glyph Lefkowitz
- Jamu Kakar
- Jarrod Roberson
- Laurens Van Houtven
- Steve Steiner (listsin)