Python Thread Question
Anand B Pillai
abpillai at lycos.com
Thu Apr 17 15:34:11 CEST 2003
I have written an application which is a kind of intranet
web-spider. It crawls a given URL, retrieves the files found
at that URL, and saves them to disk.
Now when I do this using multiple threads (Python threads),
assigning each URL to a thread, I find that the download
completes faster than it does in a single thread. I assume
the reason is simple: with a single thread, the app has to
wait until each file is downloaded before it can start the
next one. Whereas if you use a thread for each download,
the app can spawn other threads for other downloads, so it
does not have to wait. I am firing off a group of threads
(limited by a maxthread count) and pooling them in a
threadgroup.
Once the threads are fired off for download, the app does not
try to control them until they finish, are killed, or a network
In general, multithreading need not improve the speed of an
application, but in cases like this one, which involve
bottlenecks such as network traffic, it does. My questions about
1. Do Python threads work only if the native platform supports
threading? i.e., is Python firing 'C' threads which in turn
fire the platform API threads (Win32 for Windows/ pthreads for
2. Can a software API (Win32/pthreads) do multithreading even if
the CPU does not support multithreading? (This might seem like a
superfluous question when almost all CPUs do in this age, but
the question is still valid.) Or is multithreading ultimately
related to how the CPU handles threads?
3. Is the apparent increase in speed in my program using multiple
threads attributable to the CPU, the platform API, or Python?
4. Can I safely say that multithreading will improve my application's
performance if it has similar work to do on many resources at the
same time? (e.g. a web parser/spider, a disk-to-disk file copier,
a directory synchronizer) Or does it depend upon the nature of the
task at hand?
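
The situation behind questions 3 and 4 can be sketched roughly as
follows. This is only an illustration under stated assumptions, not
the spider itself: fake_download, run_sequential, run_threaded,
MAX_THREADS and the example URLs are made-up names, time.sleep stands
in for the time spent waiting on the network, and the thread cap is
done with ThreadPoolExecutor from a current Python 3 standard library
rather than a hand-built thread group.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_THREADS = 4  # hypothetical cap, playing the role of the "maxthread count"


def fake_download(url, delay=0.1):
    """Stand-in for a network fetch (assumption: no real I/O is done).

    time.sleep blocks the calling thread without using the CPU, which
    is roughly what waiting on a socket looks like to the interpreter.
    """
    time.sleep(delay)
    return url, threading.current_thread().name


def run_sequential(urls):
    """Fetch every URL one after another in the calling thread."""
    start = time.perf_counter()
    results = [fake_download(u) for u in urls]
    return results, time.perf_counter() - start


def run_threaded(urls, max_threads=MAX_THREADS):
    """Fetch the URLs with at most max_threads worker threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # pool.map returns results in the same order as the input URLs
        results = list(pool.map(fake_download, urls))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical intranet URLs, used only as labels here.
    urls = ["http://intranet.example/file%d" % i for i in range(8)]
    _, t_seq = run_sequential(urls)
    _, t_thr = run_threaded(urls)
    print("sequential: %.2fs  threaded: %.2fs" % (t_seq, t_thr))
```

With eight simulated downloads of 0.1s each and four workers, the
sequential run should take roughly 0.8s and the threaded run roughly
0.2s. The gain in this sketch comes purely from overlapping the waits,
so it says nothing about tasks that keep the CPU busy the whole time.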
Well, that is quite a lot.
Thanks for your help,