Justin Mazzola Paluska <jmp@MIT.EDU> writes:
- Should I send the files from SRC to DEST one-by-one?
That's how I would do it. If you're talking about gigabyte-sized files, the protocol overhead will be pretty minimal compared to the data being transferred. You've got a couple of objects to keep track of for each file being sent, but on the other hand it will be a lot easier to keep track of how much progress you've made (and keep the user informed) that way.
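To make the progress-tracking point concrete, here is a minimal sketch of a one-file-at-a-time sender. It is not tied to PB: send_chunk and report_progress are hypothetical callbacks (in a PB application, send_chunk would presumably wrap a callRemote), and the 64 KB default chunk size is an assumption, not something from this thread.

```python
import os

CHUNK_SIZE = 64 * 1024  # assumed default; tune for your link


def send_files(paths, send_chunk, report_progress, chunk_size=CHUNK_SIZE):
    """Send each file in turn, reporting bytes sent after every chunk.

    send_chunk(path, data) and report_progress(path, sent, total) are
    hypothetical callbacks supplied by the caller.
    """
    for path in paths:
        total = os.path.getsize(path)
        sent = 0
        with open(path, "rb") as f:
            while True:
                data = f.read(chunk_size)
                if not data:
                    break  # end of this file; move on to the next one
                send_chunk(path, data)
                sent += len(data)
                report_progress(path, sent, total)
```

Because each file is handled separately, per-file progress falls out for free: the sender always knows which file it is on and how many of its bytes have gone out.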
- Or, is it better to use something like tarfile module to create a stream of bytes that I stream to the other side and decode?
I would recommend this approach if you had a bunch of small files. You'd run that 'tar cf - WHAT' child under a ProcessProtocol that reacts to outReceived(data) (the ProcessProtocol callback that delivers the child's stdout) by doing a rref.callRemote("moreDataForYou", data). You'd probably want to accumulate data into chunks of maybe 4 KB or so to increase efficiency. At the far end, your remote_moreDataForYou() method would write that data to the stdin of the untarring process, through its ProcessProtocol's transport. Take a look at doc/core/howto/process.xhtml for details on ProcessProtocols and reactor.spawnProcess.
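The accumulate-into-chunks step might look something like the sketch below. It is plain Python with no Twisted imports; ChunkAccumulator and its send argument are hypothetical names for illustration, and the 4096-byte threshold just mirrors the rough figure above.

```python
class ChunkAccumulator:
    """Buffer small writes and pass them on once min_size bytes are queued."""

    def __init__(self, send, min_size=4096):
        # send is a hypothetical callable; in a PB setup it could be
        # something like: lambda data: rref.callRemote("moreDataForYou", data)
        self.send = send
        self.min_size = min_size
        self._parts = []
        self._size = 0

    def add(self, data):
        """Queue data, flushing as soon as the threshold is reached."""
        self._parts.append(data)
        self._size += len(data)
        if self._size >= self.min_size:
            self.flush()

    def flush(self):
        """Send whatever is buffered as one chunk; no-op when empty."""
        if self._parts:
            self.send(b"".join(self._parts))
            self._parts = []
            self._size = 0
```

In the tar-side ProcessProtocol you would call add() from the callback that receives the child's stdout, and flush() once more when the process ends so the final partial chunk is not lost. Chunks can be larger than min_size, since a flush sends everything queued so far.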
- Finally, should I be doing something completely different? Normally, outside of my application, I'd just use rsync, scp, or some such.
I'd certainly investigate this method (rsync in particular) if most of the files you are sending are already in place on the far end. The bandwidth savings are worth the extra setup hassle.
Is there a way to get rsync to speak to stdout/stdin instead of using a TCP socket? If so, you could spawnProcess('rsync') and proxy it to the far end over PB as with 'tar' above. Or, you could have your PB-connection-wielding process listen on a local TCP socket, then tell rsync to talk directly to that port, then do a socket-level proxy over PB to the far system.
Also remember that scp (or rsync-over-ssh or tar|ssh, etc) will be doing better authentication than PB, since PB is all in cleartext. Many applications don't require confidentiality, but before you switch from ssh to straight PB you should be aware of what exactly you're giving up.
<shameless plug> But if you use NewPB, you get the strong authentication and confidentiality of ssh along with all of the juicy RemoteReference model you've come to know and love from PB. Check out NewPB today! </shameless plug>