Simple thread pools
dave at pythonapocrypha.com
Fri Nov 5 16:54:44 CET 2004
Jacob Friis wrote:
>> Things as maximum number of file descriptors and generally speaking IO
>> operations and their limits are matters of the underlying OS, not of the
>> programming language itself. So show us a way in another language, and
>> we can tell you how to do that in python.
> I need to download 150000 files several times every day.
> How would you solve that?
Best bet is to use an asynchronous socket library (asyncore / asynchat / twisted
/ etc). On Linux it's fairly easy to increase the number of file descriptors
allowed per process, and if you manage your connections appropriately you can
have thousands of simultaneous open connections.
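
To make that concrete, here is a rough sketch of the two pieces: raising the per-process descriptor limit and keeping a bounded number of downloads in flight. It uses asyncio rather than asyncore/asynchat/twisted, purely because it ships with current Python; the idea is the same. The names raise_fd_limit, fetch_all, fetch_one and max_open are made up for illustration, and fetch_one stands in for whatever coroutine actually downloads one URL. The resource module is Unix-only.

```python
import asyncio
import resource  # Unix-only


def raise_fd_limit(wanted):
    """Raise the soft open-file limit toward `wanted`, capped by the hard limit.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited
    target = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    if target <= soft:
        return soft  # nothing to raise
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        return soft  # the OS refused; keep the existing limit
    return target


async def fetch_all(urls, fetch_one, max_open):
    """Run fetch_one(url) for every URL with at most max_open in flight.

    fetch_one is a hypothetical coroutine supplied by the caller.
    Results come back in the same order as urls.
    """
    queue = asyncio.Queue()
    for item in enumerate(urls):
        queue.put_nowait(item)
    results = [None] * queue.qsize()

    async def worker():
        # Each worker pulls URLs until the queue is empty, so the number
        # of workers is the cap on simultaneous connections.
        while True:
            try:
                index, url = queue.get_nowait()
            except asyncio.QueueEmpty:
                return
            results[index] = await fetch_one(url)

    await asyncio.gather(*(worker() for _ in range(max_open)))
    return results
```

A fixed set of workers is used instead of one task per URL so that 150000 URLs do not turn into 150000 pending tasks. Keep max_open comfortably below whatever raise_fd_limit returns, since log files, pipes and so on also use descriptors.
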
Elsewhere I noticed that you said the average object size is only around 15k, so
once you have the basic system working, you can probably improve performance
by focusing on things that lower transaction overhead - DNS caching, reusing
connections to servers (and pipelining HTTP requests to those servers), etc.,
but I wouldn't bother with that right away - better to get the core of the
system stable and scalable first.
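
For when you do get to that stage, here is a minimal sketch of two of those overhead reducers: a small DNS cache and reuse of one connection per server. Everything named here (DNSCache, group_by_host, fetch_from_host, the 300-second TTL) is invented for illustration, and it assumes plain http:// URLs. It uses blocking http.client to keep the example short; note that http.client reuses a connection for sequential requests but does not pipeline them.

```python
import http.client
import socket
import time
from collections import defaultdict
from urllib.parse import urlsplit


class DNSCache:
    """Remember resolver answers for `ttl` seconds (illustrative only)."""

    def __init__(self, ttl=300.0, resolver=socket.getaddrinfo):
        self.ttl = ttl
        self.resolver = resolver
        self._entries = {}

    def lookup(self, host, port):
        now = time.monotonic()
        hit = self._entries.get((host, port))
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh, skip the real lookup
        answer = self.resolver(host, port)
        self._entries[(host, port)] = (now, answer)
        return answer


def group_by_host(urls):
    """Map each host to the list of paths wanted from it."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        groups[parts.netloc].append(path)
    return dict(groups)


def fetch_from_host(host, paths, timeout=30):
    """Fetch several paths over one connection to the same server."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    bodies = []
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            # The body must be read fully before the connection
            # can carry the next request.
            bodies.append(response.read())
    finally:
        conn.close()
    return bodies
```

Grouping the URL list by host first is what makes the reuse possible: each server gets one connection and one name lookup instead of one per file.
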