[Twisted-Python] Handling too many open file descriptors

I'm running an application that makes about 1300 SNMP connections every minute. I'm using utils.getProcessOutput with snmpget, because pysnmp throws an error when I try to use it. Now of course I get the "Too many open files" error. Is the best way to handle this to raise the file descriptor limit on Linux, or to implement some sort of queue so that only x snmpget processes run at a time? Is there such a queue feature in Twisted? Thanks, Jason

Looks like a DeferredSemaphore might be your solution. Have a look at this article, which explains that and a lot more: http://oubiwann.blogspot.com/2008/06/async-batching-with-twisted-walkthrough... Arjan On 09/27/2010 04:45 PM, Landreville wrote:

On 02:45 pm, landreville@deadtreepages.com wrote:
It doesn't seem likely to me that it's useful to have a thousand snmpget processes running at once. But who knows, maybe it is. To actually know, you'll have to decide what your requirements are (how many do you need to run in parallel to get the throughput you need? how hard are you willing to thrash the machine you're running on?) and then measure various approaches and configurations. If you decide you want to limit the number of snmpget processes you launch, you might find twisted.internet.defer.DeferredSemaphore useful (and since the API docs for it are messed up, make sure you find the "run" method, not just the "acquire" and "release" methods). Jean-Paul

On Mon, Sep 27, 2010 at 10:53 AM, <exarkun@twistedmatrix.com> wrote:
I've got to fetch two values (both can be grabbed in a single snmpget call) from each of the interfaces, and insert them into a database, every minute. The machine is dedicated to this, so the processing load is fine. The only question is how much parallelism I need to fetch every value within the minute. Technically I don't need any parallelism right now, since I can grab them serially in just under a minute with a bash script, but more interfaces will be added periodically, so I need to get the fetch time down a bit. I'm already playing with DeferredSemaphore, which is neat, but I haven't looked at the stats for the fetching yet (whether it succeeds in grabbing every value each minute).

participants (3):
- Arjan Scherpenisse
- exarkun@twistedmatrix.com
- Landreville