[Twisted-Python] Agent and "Cannot assign requested address"
Hi All,

I've got a situation where I'm using t.w.c.Agent to make 100,000 POST requests against a server. Each time, a new Agent instance is built and the request is sent using it. After about 20,000 requests, I get this error:

Failure: twisted.internet.error.ConnectError: An error occurred while connecting: 99: Cannot assign requested address.

Would building the Agent once and reusing the same instance avoid this? I assume I've run out of client ports.

-J
On Thu, 2011-03-03 at 21:00 -0700, Jason J. W. Williams wrote:
Hi All,
I've got a situation where I'm using t.w.c.Agent to make 100,000 POST requests against a server. Each time, a new Agent instance is built and the request is sent using it. After about 20,000 requests, I get this error:
Failure: twisted.internet.error.ConnectError: An error occurred while connecting: 99: Cannot assign requested address.
Would building the Agent once and reusing the same instance avoid this? I assume I've run out of client ports.
Yes... except it doesn't support persistent connections yet. Do you actually need to run all 100,000 in parallel? If not, set a cap on how many requests can run in parallel.
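A standard way to put such a cap in place is twisted.internet.defer.DeferredSemaphore. A minimal sketch, where post_one is a hypothetical function returning a Deferred for a single POST:

    from twisted.internet import defer

    # Cap concurrent requests at 50; each task acquires the semaphore
    # before running and releases it when its Deferred fires.
    sem = defer.DeferredSemaphore(50)

    def post_all(keys):
        # post_one(key) is a hypothetical function that issues one
        # POST and returns a Deferred.
        ds = [sem.run(post_one, key) for key in keys]
        return defer.gatherResults(ds)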
On 01:29 pm, itamar@itamarst.org wrote:
Yes... except it doesn't support persistent connections yet. Do you actually need to run all 100,000 in parallel? If not, set a cap on how many requests can run in parallel.
It's worse than just "in parallel". After the connection closes, it moves to TIME_WAIT for two minutes. These count towards the limit as well. Jean-Paul
On Fri, 2011-03-04 at 13:33 +0000, exarkun@twistedmatrix.com wrote:
It's worse than just "in parallel". After the connection closes, it moves to TIME_WAIT for two minutes. These count towards the limit as well.
Oh right: http://twistedmatrix.com/trac/ticket/1288 You could probably set that yourself with a little hacking until that ticket is fixed.
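One possible hack along those lines (an assumption about what the "little hacking" would involve, not what the ticket prescribes) is to set SO_LINGER to zero on the client socket, so close() sends an RST and the connection skips TIME_WAIT entirely. That can drop unsent data, so it's only defensible in a load test. A sketch using the raw socket handle Twisted exposes:

    import socket
    import struct

    from twisted.internet.protocol import Protocol

    class NoTimeWaitProtocol(Protocol):
        def connectionMade(self):
            # SO_LINGER with l_onoff=1, l_linger=0: close() sends RST
            # instead of FIN, so the socket never enters TIME_WAIT.
            sock = self.transport.getHandle()
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                            struct.pack('ii', 1, 0))

Wiring a protocol tweak like this into Agent's internals is not straightforward, which is presumably the hacking part.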
Actually, I think the TIME_WAIT is the problem. It's what I see in netstat, and the Agent requests are fired sequentially via yield inside a for loop (inlineCallbacks). So they shouldn't be running in parallel.

The use case here is loading a Riak server with keys to prepare for a test. There's no real way to get around sending one POST per key.

How would I set the timeout value in Twisted? Or do I have to modify the timeout/keepalive systemwide in /proc?

-J

On Mar 4, 2011, at 6:53, Itamar Turner-Trauring <itamar@itamarst.org> wrote:

Oh right: http://twistedmatrix.com/trac/ticket/1288 You could probably set that yourself with a little hacking until that ticket is fixed.
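The sequential pattern described above looks roughly like this in current Twisted (bytes arguments; the Riak URL, payload, and bucket layout are illustrative placeholders, and keys are assumed to be bytes):

    from io import BytesIO

    from twisted.internet import defer, reactor
    from twisted.web.client import Agent, FileBodyProducer
    from twisted.web.http_headers import Headers

    @defer.inlineCallbacks
    def load_keys(keys):
        agent = Agent(reactor)
        for key in keys:
            body = FileBodyProducer(BytesIO(b'{"value": 1}'))
            # Each request still opens (and closes) its own TCP
            # connection, leaving a socket behind in TIME_WAIT.
            yield agent.request(
                b'POST',
                b'http://localhost:8098/riak/bucket/' + key,
                Headers({b'Content-Type': [b'application/json']}),
                body)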
On 05:09 pm, jasonjwwilliams@gmail.com wrote:
Actually, I think the TIME_WAIT is the problem. It's what I see in netstat, and the Agent requests are fired sequentially via yield inside a for loop (inlineCallbacks). So they shouldn't be running in parallel.
The use case here is loading a Riak server with keys to prepare for a test. There's no real way to get around sending one POST per key.
How would I set the timeout value in Twisted? Or do I have to modify the timeout/keepalive systemwide in /proc?
As far as I know, there are only system-wide settings for this value on all the major platforms. It seems like you'll be happiest using persistent connections, though, once Agent actually offers those. Jean-Paul
Yeah. Actually, that's the reason I refactored txRiak to use Agent instead of HTTPClient: so it could take advantage of pooling when that comes down the pike for Agent (well, that and HTTP/1.1 support). I guess I'll just throttle down the load rate.

Thank you for your help.

-J

On Fri, Mar 4, 2011 at 12:40 PM, <exarkun@twistedmatrix.com> wrote:
As far as I know, there are only system-wide settings for this value on all the major platforms.
It seems like you'll be happiest using persistent connections, though, once Agent actually offers those.
Jean-Paul
"Jason J. W. Williams" <jasonjwwilliams@gmail.com> writes:
Actually, I think the TIME_WAIT is the problem. It's what I see in netstat, and the Agent requests are fired sequentially via yield inside a for loop (inlineCallbacks). So they shouldn't be running in parallel.
`yield` returns before TIME_WAIT expires; otherwise it would take ~1 minute per request.
The use case here is loading a Riak server with keys to prepare for a test. There's no real way to get around sending one POST per key.
How would I set the timeout value in Twisted? Or do I have to modify the timeout/keepalive systemwide in /proc?
In addition to net.ipv4.tcp_fin_timeout, you could increase the ephemeral port range (the net.ipv4.ip_local_port_range sysctl parameter).

Each connection is identified by the 4-tuple (server IP, server port, client IP, client port). Everything except the client port is fixed in your case, so there can be at most roughly net.ipv4.ip_local_port_range / net.ipv4.tcp_fin_timeout new connections per second (even fewer in practice, due to other applications and other limits such as fs.file-max). For example, with:

net.ipv4.ip_local_port_range = 32768 61000
net.ipv4.tcp_fin_timeout = 30

that works out to (61000 - 32768) / 30 ≈ 940 connections per second, which might be good enough.

Reusing a local port via SO_REUSEADDR, or better yet reusing a TCP connection via HTTP keep-alive, isn't available with Twisted as I understand it.

-- akira
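The same back-of-the-envelope arithmetic, reading the live values from /proc (Linux-only; these are the standard sysctl paths):

    # Rough upper bound on new outgoing connections per second to a
    # single server address, per the reasoning above.
    def max_new_connections_per_sec():
        with open("/proc/sys/net/ipv4/ip_local_port_range") as f:
            low, high = map(int, f.read().split())
        with open("/proc/sys/net/ipv4/tcp_fin_timeout") as f:
            fin_timeout = int(f.read())
        return (high - low) / fin_timeout

    # With the example values: (61000 - 32768) / 30 ≈ 941.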
On Mar 10, 2011, at 5:31 AM, akira wrote:
Reusing a local port via SO_REUSEADDR, or better yet reusing a TCP connection via HTTP keep-alive, isn't available with Twisted as I understand it.
Reusing a local connection-oriented port with SO_REUSEADDR is potentially a bad idea; there's a reason your TCP stack gives you this error. In practice, that option is only useful for listening ports. Keep-alive is a work in progress: <http://twistedmatrix.com/trac/ticket/3420>.
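Ticket 3420 was eventually resolved: Twisted 11.1 added HTTPConnectionPool, which gives Agent the persistent connections discussed here. With a modern Twisted, the pooled setup looks like this (the per-host limit shown is just an example value):

    from twisted.internet import reactor
    from twisted.web.client import Agent, HTTPConnectionPool

    # A persistent connection pool shared across requests, so repeated
    # POSTs reuse TCP connections instead of burning an ephemeral port
    # (and a TIME_WAIT slot) per request.
    pool = HTTPConnectionPool(reactor, persistent=True)
    pool.maxPersistentPerHost = 10
    agent = Agent(reactor, pool=pool)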
I ended up getting around the problem by increasing my Riak cluster size and putting a load balancer in front for the test. But connection pooling would be really helpful, both here and in the CouchDB client. I've refactored both txRiak and Paisley in the past couple of months to use Agent in the hopes that ticket 3420 gets completed. :)

-J

On Thu, Mar 10, 2011 at 3:31 AM, akira <4kir4.1i@gmail.com> wrote:
"Jason J. W. Williams" <jasonjwwilliams@gmail.com> writes:
Actually, I think the TIME_WAIT is the problem. It's what I see in netstat, and the Agent requests are fired sequentially via yield inside a for loop (inlineCallbacks). So they shouldn't be running in parallel.
`yield` returns before TIME_WAIT expires otherwise it would require ~1 minute per request.
The use case here is loading a Riak server with keys to prepare for a test. There's not a real way to get around sending one POST per key.
How would I set the timeout value in Twisted? Or do I have to modify the timeout/keepalive systemwide in /proc?
In addition to net.ipv4.tcp_fin_timeout you could increase the ephemeral port range (net.ipv4.ip_local_port_range sysctl parameter).
Each connection can be identified using 4-tuple (server IP, server port, client IP, client port) Everything except client port is fixed in your case so there could be at most ~ net.ipv4.ip_local_port_range/net.ipv4.tcp_fin_timeout connections per second (even less in practice due to other applications and other settings taking preference such as fs.file-max). For example:
net.ipv4.ip_local_port_range = 32768 61000 net.ipv4.tcp_fin_timeout = 30
There could be ~900 connections per second that might be good enough.
Reusing a local port via SO_REUSEADDR, or better yet reusing a TCP connection via HTTP keep-alive, isn't available with Twisted as I understand it.
participants (5)
- akira
- exarkun@twistedmatrix.com
- Glyph Lefkowitz
- Itamar Turner-Trauring
- Jason J. W. Williams