Why does using threads not speed up things?

Franz GEIGER fgeiger at datec.at
Thu Jan 25 03:33:36 EST 2001


I wrote a file synchronizer that copies newer files to a backup
destination and deletes files there that no longer exist on the source.
The sync source and the sync destination are connected by a LAN, so I
thought it would be a good idea to scan the source and the destination in
parallel, using two threads. I expected a performance gain of, let's say,
30 %.

So I ran a few tests and wrote a sample program that gathers file names. It
does this in 2 tasks: Task 1 gathers the file names on the source, task 2 on
the destination. To be able to compare this version with the one I have used
until now, I also did it the old way: first scan the source, then scan the
destination. I call the former the multithreaded version and the latter the
singlethreaded version.
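
Roughly, the two variants look like the sketch below. It is a simplified
illustration and not the actual test program: the names gather_names,
scan_sequential and scan_threaded are made up here, and os.walk is assumed
as a stand-in for whatever directory traversal the real program uses.

```python
import os
import threading


def gather_names(root):
    """Return the paths of all files below root, relative to root."""
    names = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            names.append(os.path.relpath(full, root))
    return names


def scan_sequential(source, destination):
    """Singlethreaded variant: scan the source, then the destination."""
    return gather_names(source), gather_names(destination)


def scan_threaded(source, destination):
    """Multithreaded variant: scan both trees at the same time."""
    results = {}

    def worker(key, root):
        # Each thread stores its list under its own key.
        results[key] = gather_names(root)

    threads = [
        threading.Thread(target=worker, args=("source", source)),
        threading.Thread(target=worker, args=("destination", destination)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until both scans have finished
    return results["source"], results["destination"]
```

Both functions return the same pair of lists, so the only thing to compare
is the wall-clock time of each call, for example measured with
time.perf_counter() around it.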

The singlethreaded version took 1900 seconds to execute, the multithreaded
version took 1600 seconds. The gain is 300 seconds, so the multithreaded
version needs about 16 % less time than the singlethreaded version.

If source and destination lie on the same machine, there is no significant
difference in performance.

As I said, I had expected more, because a lot of slow disk I/O is involved,
plus data transfer over a 10 Mbit section of our LAN. Or did I miss
something?

Best regards
Franz GEIGER






