[Twisted-Python] Backup Agent

OK guys, I need your help. I own a small computer consulting firm and am struggling with the basic sales and marketing issues. However, I think I have finally stumbled onto a great "no brainer" idea: a remote backup service for small businesses. Yes, I know there are a bunch out there, but I am thinking of the hundreds of businesses around my general geographic area who don't know how to find them and who don't back up their data. I would charge a monthly fee for a given amount of space used (no more than 500 Meg) and provide software that would back up and restore their data.

My dilemma is that my margins would be too small to make it affordable for me to resell someone else's software. So... I would like to roll my own. Python makes a lot of sense for many reasons that I don't need to get into here. My struggle is that although I have been programming since the mid-70s, I am new to Python and feel like a man groping around in the dark. Also, being new to Twisted, it is like the room is full of chairs as well. I did get some sample code to work over the weekend, but I got a serious brain cramp trying to figure it all out.

The basic architecture that I am thinking of is client/server. There would be three basic applications running to make things work. First would be an Agent program that does most of the work. The Agent would run as a service on each computer being backed up. Second would be a GUI program running locally to control the agents. Finally, there would be a "Web Service" that receives backup sets from each agent or from a designated master agent. The Web Service would handle authentication and other housekeeping things.

Finally, the question! There are many other details that I have in mind as well (using patches to reduce size, compression, encryption...), but right now I am working on issues related to the communication between the GUI and the agents. Assuming that I have one or more agents running on a network, I want to do a number of things:

1) Have the GUI request that the agents identify themselves so I don't have to tell the GUI where they are.
2) Transmit a catalog list of the file system to the GUI so I can build a TreeView and pick which files to back up on each computer.
3) Finally, I may want to identify a "Master" agent that collects all the backups on the network and transmits them as one session to the Web Service. This would allow for local "on site" backups for fast restores after normal computer crashes and failures.

1. What would you guys suggest I do to get the GUI to ask all the agents to identify themselves?
2. What would be the best approach using Twisted to transmit a file system catalog across the local wire? I was playing with some remote procedure calls over the weekend that seemed to work. I am thinking I could create a class that would gather the local file system and send it over the wire as one lump.
3. If I am going to transmit a backup file (<500meg) over the wire to another local computer, what is the best Twisted approach to take? Due to WinXP security issues and the desire to run in a cross-platform environment, I don't want to rely on normal file shares.

There are a hundred other questions running through my mind right now, but this will do for now.

Thanks for any help you can give,

David A. Leedom
The Hightower Group, Inc.
Custom Software Solutions Designed To Fit Your Business Like A Glove.
165 West Airport Road/Lititz, PA 17543
V:717-560-4002, 877-560-4002 x: 114
F:717-560-2825
www.hightowergroup.com

David A. Leedom [Mon, Feb 16, 2004 at 10:03:37AM -0500]:
1. What would you guys suggest I do to get the GUI to ask all the agents to identify themselves?
Well, that's a good question. In one application I wrote, I used the machine's IP address plus some information about the operating system it runs (sys.platform, os.uname) to build a kind of machine identifier. You could also generate a UID, save it, and then send it together with the machine's IP address to identify the machine on the server, by hand. OS-specific procedures could help you with this.
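As a rough illustration of the two ideas above (a saved UID plus IP and OS details), here is a minimal sketch using only the standard library. The function names and the shape of the returned dictionary are hypothetical, and platform.uname() is used as a portable stand-in for os.uname(), which is not available on Windows.

```python
import os
import platform
import socket
import sys
import uuid


def load_or_create_agent_id(path):
    """Return the UID stored at `path`, creating and saving one if missing."""
    if os.path.exists(path):
        with open(path, "r", encoding="ascii") as fh:
            return fh.read().strip()
    agent_id = uuid.uuid4().hex
    with open(path, "w", encoding="ascii") as fh:
        fh.write(agent_id)
    return agent_id


def describe_machine(agent_id):
    """Collect details an agent could report when asked to identify itself."""
    hostname = socket.gethostname()
    try:
        ip_address = socket.gethostbyname(hostname)
    except OSError:
        ip_address = "unknown"  # name resolution can fail on some networks
    return {
        "agent_id": agent_id,
        "hostname": hostname,
        "ip": ip_address,
        "platform": sys.platform,
        "system": platform.uname().system,  # portable stand-in for os.uname()
    }
```

Because the UID is read back from disk on later runs, the agent keeps the same identity even if its IP address changes.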
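2. What would be the best approach using Twisted to transmit a file system catalog across the local wire?
Remote procedure calls have a limit of 640 KB on their parameters (because "640 KB should be enough for anyone"). You should use Pager and Collector.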
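To show the idea behind paging, here is a minimal standard-library sketch that gathers a catalog and splits its serialized form into pages small enough to send one call at a time. It does not use Twisted's own Pager classes; the function names, the JSON encoding, and the 8192-byte page size are all assumptions made for illustration.

```python
import json
import os


def build_catalog(root):
    """Walk `root` and return a sorted list of (relative_path, size_in_bytes)."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            entries.append((os.path.relpath(full, root), size))
    entries.sort()
    return entries


def iter_pages(payload, page_size=8192):
    """Yield successive slices of `payload` (bytes), each at most `page_size` long."""
    for start in range(0, len(payload), page_size):
        yield payload[start:start + page_size]


def collect_pages(pages):
    """Reassemble the pages on the receiving side."""
    return b"".join(pages)
```

The sender would serialize the catalog once (for example with json.dumps(...).encode("utf-8")), send each page in its own call, and the receiver would join the pages before decoding.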
3. If I am going to transmit a backup file (<500meg) over the wire to another local computer what is the best Twisted approach to take?
See above. Also, please remember that memory matters. If you're paging the data (sending small chunks), you should write each chunk to disk on the backup server as it is received. If you are transferring a big file, keeping all the received data in a buffer is a bad idea.
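A minimal sketch of the "save chunks as they are received" advice follows. The class and method names are hypothetical; in a Twisted protocol, something like chunk_received could be called from dataReceived. The SHA-256 digest is an extra added here so the receiver can check the finished file without re-reading it.

```python
import hashlib


class ChunkedFileWriter:
    """Write each received chunk straight to disk instead of buffering it."""

    def __init__(self, path):
        self._fh = open(path, "wb")
        self._digest = hashlib.sha256()
        self.bytes_written = 0

    def chunk_received(self, chunk):
        """Append one chunk to the file and update the running digest."""
        self._fh.write(chunk)
        self._digest.update(chunk)
        self.bytes_written += len(chunk)

    def finish(self):
        """Close the file and return the hex digest of everything written."""
        self._fh.close()
        return self._digest.hexdigest()
```

Memory use stays bounded by the size of a single chunk, no matter how large the backup file is.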

Michal Pasternak wrote:
David A. Leedom [Mon, Feb 16, 2004 at 10:03:37AM -0500]:
PB probably isn't good for massive file transfer like that. I'd either use an existing tool like rsync or HTTP or whatever, or roll a trivial protocol that just sends length_of_data:data.

--
 Twisted | Christopher Armstrong: International Man of Twistery
  Radix  | Release Manager, Twisted Project
---------+ http://radix.twistedmatrix.com/
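The length_of_data:data framing suggested above could look roughly like the sketch below. The names encode_frame and FrameDecoder are hypothetical, and the decimal ASCII length prefix is one assumed reading of "length_of_data". Twisted's twisted.protocols.basic module also ships NetstringReceiver and Int32StringReceiver, which implement similar length-prefixed framing.

```python
def encode_frame(data):
    """Prefix `data` (bytes) with its decimal length and a colon."""
    return str(len(data)).encode("ascii") + b":" + data


class FrameDecoder:
    """Incrementally split a byte stream back into frames."""

    def __init__(self):
        self._buffer = b""

    def feed(self, received):
        """Add newly received bytes and return a list of completed frames."""
        self._buffer += received
        frames = []
        while True:
            header, sep, rest = self._buffer.partition(b":")
            if not sep:
                break  # length prefix not complete yet
            length = int(header)  # raises ValueError on a malformed prefix
            if len(rest) < length:
                break  # wait for the rest of the payload
            frames.append(rest[:length])
            self._buffer = rest[length:]
        return frames
```

Note that the decoder holds one whole frame in memory, so a large backup file is better sent as many modest frames than as a single 500 MB frame.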

Participants (4)
- Christopher Armstrong
- David A. Leedom
- Michal Pasternak
- Tommi Virtanen