Threading problem / Paramiko problem ?

mk mrkafk at gmail.com
Mon Dec 28 10:36:35 EST 2009


Hello everyone,

I wrote a "concurrent ssh" client using Paramiko, available here:
http://python.domeny.com/cssh.py

This program has a function for concurrent remote file/dir copying 
(class SSHThread, method 'sendfile'). One thread is started per host 
specified (with a work queue of bounded maximum length, of course).
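
To make the setup concrete, here is a minimal sketch of the general
pattern (one thread per host, with a cap on how many run at once). It is
not the code from cssh.py: send_file is a hypothetical stand-in for the
real Paramiko transfer, and MAX_ACTIVE and copy_to_hosts are names made
up for illustration.

```python
import threading

MAX_ACTIVE = 4  # assumed cap on simultaneously running copy threads


def send_file(host, path):
    """Hypothetical stand-in for the real per-host transfer."""
    return "%s:%s" % (host, path)


def copy_to_hosts(hosts, path, max_active=MAX_ACTIVE):
    """Start one thread per host, never more than max_active at a time."""
    slots = threading.BoundedSemaphore(max_active)
    results = {}
    results_lock = threading.Lock()

    def worker(host):
        try:
            outcome = send_file(host, path)
            with results_lock:
                results[host] = outcome
        finally:
            slots.release()  # free the slot even if the copy failed

    threads = []
    for host in hosts:
        slots.acquire()  # blocks while max_active threads are running
        t = threading.Thread(target=worker, args=(host,))
        t.start()
        threads.append(t)

    for t in threads:
        t.join()
    return results
```

Calling copy_to_hosts(["host1", "host2", "host3"], "/tmp/file") returns 
a dict with one entry per host once all threads have been joined.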

It does have a problem with threading or Paramiko, though:

- If I specify, say, 3 hosts, the 3 threads start copying onto the 
remote hosts fast (on a virtual machine, 10-15MB/s), using somewhat below 
100% of CPU all the time (I wish it were less CPU-consuming, but I'm 
sending the file portion by portion and it's coded in Python, plus 
there are other calculations, well...)

- If I specify, say, 10 hosts, copying is fast and the CPU is under load 
until there are 2-3 threads left; then CPU load goes down to some 15% 
and copying gets slow (some 1MB/s).

It looks as if the CPU time gets divided into more or less even portions 
among the threads running at the moment when the maximum number of 
threads is active (10 in this example) *and it stays this way even after 
some threads have finished and been join()ed*.

I do join() the finished threads (could someone take a look at the 
code?). Yet the CPU consumption and copying speed go down.
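
For reference, this is roughly the kind of check that confirms finished
threads are really gone. It is a sketch using only the standard
threading module; reap_finished is a made-up helper name, not something
from cssh.py.

```python
import threading


def reap_finished(threads, timeout=0.0):
    """Join threads that have finished; return the ones still running."""
    still_running = []
    for t in threads:
        t.join(timeout)  # returns at once if t has already finished
        if t.is_alive():
            still_running.append(t)
    return still_running
```

Comparing len(reap_finished(threads)) with threading.active_count() 
(which also counts the main thread) should show whether any copy thread 
is lingering after its transfer is done.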

Now, it's either that, or Paramiko caps the sending bandwidth per 
thread at the total divided by the number of senders. I have no idea 
which and, what's worse, no idea how to test this. I've done profiling, 
which indicated nothing: basically all function calls except time.sleep 
take negligible time.
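
One possible way to test this, as a rough sketch only: count the bytes
each thread sends and sample the per-host rate at intervals, so the rate
can be compared against the number of threads still alive.
ThroughputMeter and its methods are hypothetical names; nothing here is
Paramiko API, and the sending loop would have to call add() itself after
each portion is written.

```python
import threading
import time


class ThroughputMeter:
    """Thread-safe byte counter giving per-host rates between samples."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._lock = threading.Lock()
        self._pending = {}  # bytes per host since the last sample
        self._last_sample = clock()

    def add(self, host, nbytes):
        """Record nbytes sent to host (call after each portion)."""
        with self._lock:
            self._pending[host] = self._pending.get(host, 0) + nbytes

    def sample(self):
        """Return {host: bytes/sec} since the previous sample, then reset."""
        with self._lock:
            now = self._clock()
            elapsed = now - self._last_sample
            self._last_sample = now
            pending, self._pending = self._pending, {}
        if elapsed <= 0:
            return {}
        return {host: n / elapsed for host, n in pending.items()}
```

A monitoring loop could print meter.sample() together with 
threading.active_count() every few seconds. If the per-host rate stays 
flat while the thread count drops, that would point at a per-sender 
limit rather than at CPU scheduling, though this is a guess at how to 
read the numbers, not a verified diagnosis.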

Regards,
mk




More information about the Python-list mailing list