
Hi, what would be the right thing to start from in order to build a multi-reactor architecture to handle thousands of concurrent connections? I appreciate the help.

Why would you want multiple reactors? The only reason would be to have one per CPU core. For that, the simplest thing would be to manually start n twistd processes, one per core, and have a reverse proxy process listening on a port and distributing connections to each twistd process. Of course, you can extend this architecture to multiple machines. Cheers, Reza
--
Reza Lotun
mobile: +44 (0)7521 310 763
email: rlotun@gmail.com
work: reza@tweetdeck.com
twitter: @rlotun
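A minimal sketch of the port layout for the "one twistd per core" setup, in plain Python. The `worker_ports` helper and the port numbering scheme are illustrative assumptions, not part of Twisted; the `twistd` invocation in the comment is the real tool, but the exact flags depend on your application:

```python
import os

def worker_ports(base_port=8000, cores=None):
    # One localhost port per CPU core; each port would be served by
    # its own twistd process, started e.g. as:
    #   twistd -n web --port <port>   (exact flags depend on your app)
    if cores is None:
        cores = os.cpu_count() or 1
    return [base_port + i for i in range(cores)]
```

The reverse proxy then only needs to know this list of ports to spread connections across all cores.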

This is a classic distributed systems architecture. A reverse proxy can be something like haproxy, nginx, Apache, Perlbal, or whatever (even another Twisted process). The twistd processes can be seen as simply other machines on a LAN - except they all live at 127.0.0.1:<port>, where the port is different for each process. I don't have a more comprehensive example, but there are many, many examples of reverse proxying servers all over the place - google "nginx reverse proxy" for plenty. Hope that helps. Reza
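The distribution step can be sketched as a simple round-robin over the local backends. This is a toy model (the `RoundRobinBackends` class is made up for illustration); real proxies like nginx or haproxy do this plus health checks, timeouts, and connection reuse:

```python
from itertools import cycle

class RoundRobinBackends:
    """Rotate through the local 127.0.0.1:<port> backends,
    one per twistd process (health checks omitted)."""
    def __init__(self, ports):
        self._backends = cycle(("127.0.0.1", port) for port in ports)

    def pick(self):
        # the next incoming connection goes to the next backend in rotation
        return next(self._backends)
```

Because each backend is addressed as host:port, swapping 127.0.0.1 for real LAN addresses extends the same scheme to multiple machines.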

You're right, it is a distributed systems architecture. Probably I need to rephrase my question more precisely: how do you build a distributed system architecture with Twisted technology only? As you mentioned, even the reverse proxy could be another Twisted process. Quoting "Reza Lotun" <rlotun@gmail.com>:

On Thu, Nov 12, 2009 at 12:53 PM, <vitaly@synapticvision.com> wrote:
You might look at txLoadBalancer: https://launchpad.net/txloadbalancer Kevin Horn

On 03:59 pm, vitaly@synapticvision.com wrote:
Doesn't the event loop have a limit of connections it could handle?
Multiple reactors aren't a realistic solution to this. The solution is to switch to an event loop that has a higher limit. "The" event loop is actually a choice among many possible event loops. So connection limits aren't a good reason to want multiple reactors. Jean-Paul
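Jean-Paul's point - that "the" event loop is really a choice - can be illustrated with the standard library's `select` module, which exposes the same OS mechanisms Twisted's reactors are built on. The `best_available_poller` helper is a hypothetical name for illustration; with Twisted itself you would call e.g. `epollreactor.install()` before importing the reactor:

```python
import select

def best_available_poller():
    # select() is capped by FD_SETSIZE (commonly 1024 fds), while
    # epoll (Linux) and kqueue (BSD/macOS) have no such fixed ceiling -
    # which is why switching event loops raises the connection limit.
    for name in ("epoll", "kqueue", "poll", "select"):
        if hasattr(select, name):
            return name
    return None
```

On a Linux box this returns "epoll", matching the advice elsewhere in this thread to use the epoll reactor.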

I've gotten confused enough already :-)) Once there is a Site serving many clients and, for example, reactor.listenSSL(), which actually serves many TCP connections, all going through TwistedGateway - my logic, please correct me if I'm wrong, says at some point there will be a limit on concurrent TCP connections. So how is that solved with Twisted? Quoting exarkun@twistedmatrix.com:

That's a good question, and I'm not sure there's a definitive answer (as far as I know). I think it depends on your application - for example, if your server is performing a big computation, then on average client connections will last longer, meaning you'll have more concurrent connections. The best way to determine this is to *measure* it - for example, you can do a load test with httperf and ramp up connections until things start to break or become unresponsive. You can mitigate the situation by tuning your platform a bit (assuming you're using Linux):
- use the epoll reactor, which is high performance
- make sure the number of open file descriptors is set to something high (and *not* 1024) - see `ulimit -a`
- make sure you tune your TCP settings - see /etc/sysctl.conf, namely fs.file-max and various net.ipv4 settings (google is your friend on the best settings, coupled with testing)
Cheers, Reza
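The file-descriptor limit mentioned above can be checked from inside the process with the standard library's `resource` module (Unix only); the `fd_limits` helper name is made up for illustration:

```python
import resource

def fd_limits():
    # The same numbers `ulimit -n` reports: a soft limit of 1024
    # caps concurrent connections long before the event loop does,
    # since every TCP connection consumes one file descriptor.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard
```

Logging these at server startup is a cheap way to catch a forgotten `ulimit` before a load test does it for you.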

Thank you for such a detailed response. I feel I've finally succeeded in expressing my original question correctly. So if I go one step forward and assume that there is indeed such a limit on concurrent connections, THEN: should it be resolved by another architecture, another way of using Twisted, or something else? Quoting "Reza Lotun" <rlotun@gmail.com>:

Again, I don't think there are any universal answers to this question. It depends on what you're building. For example, say it's a REST API, which by design is stateless (i.e. no sessions). Then you can stick a load balancer in front (if you're on EC2, Amazon has an "Elastic Load Balancer" service for this) and load balance amongst many machines. As traffic increases you simply add more machines. This is called "horizontal scalability" and, as you might imagine, it's highly desirable. Another form is "vertical scalability" - that involves getting a faster computer to run your server on. This might work for some cases, but not in general - it seems to be the method applied to scaling RDBMSs, before going down the road of master/slave setups, sharding and denormalization. Of course, you *could* use a different technology entirely when you need to scale really high. This might make sense if you're a small company and growing - say you start out as a small team, and you need something up quickly that's fairly decent. You happen to know Python, so you roll the whole thing out in Twisted. As time progresses, you may rewrite certain systems in, say, Erlang, and move forward. So, it's hard to say, really. At least, I'd like to know myself ;-) That's what makes the whole field so interesting - there's a certain creative element to scalable systems. Cheers, Reza
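The stateless-plus-load-balancer idea can be sketched in a few lines. The `StatelessPool` class and hostnames are made up for illustration; the point is that with no sessions, any host can serve any request, so capacity grows by appending hosts - whereas sessions would break this, since a request could land on a machine that doesn't hold its session:

```python
class StatelessPool:
    """Toy model of horizontal scaling for a stateless REST API."""
    def __init__(self, hosts):
        self.hosts = list(hosts)

    def add(self, host):
        # "as traffic increases you simply add more machines"
        self.hosts.append(host)

    def route(self, request_id):
        # modulo routing stands in for a real load balancer
        return self.hosts[request_id % len(self.hosts)]
```
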

So if I stick to "vertical scalability" (the Site has sessions), is it going to help performance to run the Twisted reactor on a single-core machine vs. a multi-core machine (after all, Python itself has a Global Interpreter Lock)? OR should the entire "TwistedGateway+listenSSL+Site+reactor" usage be re-designed for the project? What about the influence of a 64-bit machine on Twisted? Quoting "Reza Lotun" <rlotun@gmail.com>:

participants (4):
- exarkun@twistedmatrix.com
- Kevin Horn
- Reza Lotun
- vitaly@synapticvision.com