Do I have to use threads?
Jorgen Grahn
grahn+nntp at snipabacken.se
Fri Jan 8 09:21:38 EST 2010
On Wed, 2010-01-06, Gary Herron wrote:
> aditya shukla wrote:
>> Hello people,
>>
>> I have 5 directories corresponding to 5 different URLs. I want to
>> download images from those URLs and place them in the respective
>> directories. I have to extract the contents and download them
>> simultaneously. I can extract the contents and do them one by one. My
>> question is: to do it simultaneously, do I have to use threads?
>>
>> Please point me in the right direction.
>>
>>
>> Thanks
>>
>> Aditya
>
> You've been given some bad advice here.
>
> First -- threads are lighter-weight than processes, so threads are
> probably *more* efficient. However, with only five threads/processes,
> the difference is probably not noticeable. (If the prejudice against
> threads comes from concerns over the GIL -- that also is a misplaced
> concern in this instance. Since you only have one network connection, you
> will receive only one packet at a time, so only one thread will be
> active at a time. If the extraction process uses a significant enough
> amount of CPU time
I wonder what that "extraction" would be, by the way. Unless you ask
for compression of the HTTP data, the images come as-is on the TCP
stream.
> so that the extractions are all running at the same
> time *AND* if you are running on a machine with separate CPU/cores *AND*
> you would like the extractions to be running truly in parallel on those
> separate cores, *THEN*, and only then, will processes be more efficient
> than threads.)
I can't remember what the bad advice was, but here processes versus
threads clearly doesn't matter performance-wise. I generally
recommend processes, because how they work is well-known and they're
not as vulnerable to weird synchronization bugs as threads.
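For comparison, here is a minimal sketch of what the thread version
could look like. It assumes a Python that has concurrent.futures and
urllib.request; the names download_all, fetch and the (url, directory,
filename) job layout are made up for illustration and are not from the
original question.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url):
    """Return the raw bytes behind url (hypothetical default fetcher)."""
    with urlopen(url) as resp:
        return resp.read()


def download_all(jobs, fetcher=fetch, max_workers=5):
    """Download every (url, directory, filename) job in its own thread.

    Returns the list of written file paths, in the same order as jobs.
    """
    def one(job):
        url, directory, filename = job
        data = fetcher(url)
        # Each job writes to its own directory, so the threads share no state.
        os.makedirs(directory, exist_ok=True)
        path = os.path.join(directory, filename)
        with open(path, "wb") as f:
            f.write(data)
        return path

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(one, jobs))
```

Because no data is shared between the workers, there is nothing to
lock here, which is about the only situation where I am comfortable
with threads.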
> Second, running 5 wgets is equivalent to 5 processes not 5 threads.
>
> And third -- you don't have to use either threads *or* processes. There
> is another possibility which is much more light-weight: asynchronous
> I/O, available through the low level select module, or more usefully
> via the higher-level asyncore module.
Yeah, that would be my first choice too for a problem which isn't
clearly CPU-bound. Or my second choice -- the first would be calling
on a utility like wget(1).
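For the select route, a rough sketch of the core loop might look like
the following. It only covers the part where one thread drains several
already-connected sockets at once; connecting and sending the HTTP
requests is left out, and the name read_all is my own invention.

```python
import select
import socket


def read_all(socks, bufsize=4096):
    """Read every socket to EOF from a single thread.

    Returns a dict mapping each socket to the bytes received on it.
    """
    chunks = {s: [] for s in socks}
    pending = set(socks)
    while pending:
        # Block until at least one of the remaining sockets has data (or EOF).
        readable, _, _ = select.select(list(pending), [], [])
        for s in readable:
            data = s.recv(bufsize)
            if data:
                chunks[s].append(data)
            else:
                # An empty read means the peer closed the connection.
                pending.discard(s)
    return {s: b"".join(parts) for s, parts in chunks.items()}
```

Whichever socket has data gets serviced, so five slow downloads
overlap without any threads or extra processes.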
/Jorgen
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .