[Twisted-Python] How to use ampoule?

Hi everyone,

I am using Twisted to build a server, and the computing requests may cost a lot of CPU resources. I have checked the mailing list, and it seems I can use the ampoule plugin to create another process. I have checked the website and downloaded the source code. There seems to be no documentation for it. I did find an examples directory, but after checking the code I was confused. There should be two files, one for the client and one for the server, am I right? The client will send the request to the server (could it be on another machine?), and the server responds. But I can't find anything which matches what I thought. Can anyone explain a little bit to me? Or if there is some code, that would be better.

And the Twisted server will receive a lot of binary data from the client. If I use ampoule, I have to send the same data to the ampoule server again (now the Twisted server acts as the ampoule client). Is ampoule suitable for such a kind of task? Or should I just use the Twisted process module?

I don't know whether my understanding is correct or not, please correct me.

Best Wishes
Chris

On Feb 20, 2009, at 2:08 AM, Chris wrote:
> Hi everyone, I am using Twisted to build a server, and the computing requests may cost a lot of CPU resources. I have checked the mailing list, and it seems I can use the ampoule plugin to create another process. I have checked the website and downloaded the source code. There seems to be no documentation for it. I did find an examples directory, but after checking the code I was confused. There should be two files, one for the client and one for the server, am I right? The client will send the request to the server (could it be on another machine?), and the server responds. But I can't find anything which matches what I thought. Can anyone explain a little bit to me? Or if there is some code, that would be better.
I don't have much time to write documentation. I basically spend most of my time documenting the code itself and testing it, and I attached some examples that one can use to learn the basics on one's own, but...

You don't need 2 files. You need the code you want to run on both sides of the connection. Ampoule doesn't support shipping functions; one might implement that on top of what ampoule offers, but it is not recommended.

AMPoule uses AMP as the communication protocol between the caller and the process pool. This means that they talk to each other using twisted.protocols.amp and that, for abstraction's sake, they work as 2 separate networked services that make RPC calls to each other. It also means that in order to develop a process pool you will use AMP abstractions and classes on top of what AMPoule offers by default.

Let's look at the simplest example: examples/pid.py

What you need is a set of commands that a child process should be able to answer. In pid.py this set is made of just a single command:

    from twisted.protocols import amp

    class Pid(amp.Command):
        response = [("pid", amp.Integer())]

Once you have defined a command that you want to be able to run in a process pool, you need to define a child process that is able to answer the Pid command. Defining this also defines what every worker in the process pool will be able to answer. In the example this is done with the following lines:

    from ampoule import child

    class MyChild(child.AMPChild):
        @Pid.responder
        def pid(self):
            import os
            return {"pid": os.getpid()}

We define a child of the process pool called MyChild and, using AMP machinery, we set the method pid as the responder for the command Pid. Unsurprisingly, this command gets the pid and returns it.

Now we need to run this code and use it. So far we have defined the server (ProcessPool) side of things.
[NOTE: To run everything from a single file using "python filename.py" we need to hack around Python's import system. This is the only reason util.mainpoint exists: it allows the script to import itself. You'll notice that a script's name when started with "python filename.py" is not filename but __main__.]

Here's the code:

    from ampoule import util  # provides the mainpoint decorator

    @util.mainpoint
    def main(args):
        import sys
        from twisted.internet import reactor, defer
        from twisted.python import log
        log.startLogging(sys.stdout)

        from ampoule import pool

        @defer.inlineCallbacks
        def _run():
            # A pool with exactly one worker, built from MyChild.
            pp = pool.ProcessPool(MyChild, min=1, max=1)
            yield pp.start()
            result = yield pp.doWork(Pid)
            print "The Child process PID is:", result['pid']
            yield pp.stop()
            reactor.stop()

        reactor.callLater(1, _run)
        reactor.run()

Besides all the standard Twisted imports and boilerplate code, the core of the client is inside the _run function. It creates the process pool, telling it to use MyChild as the specification for its children. We also tell it that both the minimum and the maximum size of the pool are 1. The code then starts the pool, and once it's started we can submit commands. This is done with:

    result = yield pp.doWork(Pid)

or

    result = yield pp.callRemote(Pid)

The script then prints the result, stops the pool and exits.

Strictly speaking, the client of the process pool only needs to know the commands that it wants to execute and which pool (if there is more than one) can execute them. This, however, is only true when the server is started separately from the client using the "twistd ampoule" plugin. If your client starts the pool by itself, it also needs to know the class that defines the children's protocol.
The twistd ampoule plugin does nothing more than take a child class and a parent class (the parent class defines what the master of the process pool can speak, so that children can make calls against it if needed) and, using some hacks, expose the exact same interface across the network using the parameters that you pass on the command line.
> And the Twisted server will receive a lot of binary data from the client. If I use ampoule, I have to send the same data to the ampoule server again (now the Twisted server acts as the ampoule client). Is
Correct. I don't see any other way to do this except maybe if you have a distributed filesystem, in which case you might want to just save the data on the server and then make the right calls on the remote pools that have access to this distributed filesystem so that they can process the shards that they have access to.
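To make the shared-storage idea concrete, here is a rough sketch using only the standard library. It assumes a directory that both the server and the workers can read (a distributed filesystem mount, for instance). The helper names store_payload and load_payload are hypothetical, and naming the file after its SHA-256 digest is just one possible choice. The point is that the call to the pool would then only need to carry the short path, not the data itself.

```python
import hashlib
import os
import tempfile


def store_payload(data, shared_dir):
    """Server side: write the bytes under shared_dir and return the path."""
    name = hashlib.sha256(data).hexdigest()  # content-derived file name
    path = os.path.join(shared_dir, name)
    with open(path, "wb") as f:
        f.write(data)
    return path


def load_payload(path):
    """Worker side: read the bytes back from the shared location."""
    with open(path, "rb") as f:
        return f.read()


def demo():
    # A temporary directory stands in for the shared filesystem here.
    with tempfile.TemporaryDirectory() as shared_dir:
        blob = b"\x00\x01" * 100000  # 200000 bytes, too big for one AMP value
        path = store_payload(blob, shared_dir)
        return load_payload(path) == blob and len(path) < 65535
```

In a real deployment the path (or some other key) would be the argument of the command sent to the worker, and the worker's responder would call something like load_payload before processing.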
> ampoule suitable for such a kind of task? Or should I just use the Twisted process module?
Using the process module is not really different. The only issue you might find is the limit imposed by AMP of 64KB of data that can be transmitted in a single argument of a call to a given child. So, for example:

    pool.callRemote(Command, argument="I'm a bit string of more than 64KB")

won't work, but:

    pool.callRemote(Command, argument="I'm a bit string of LESS than 64KB")

will. There are multiple ways to solve this issue, but essentially I wouldn't use ampoule to transport big quantities of data. You'd be better off using a distributed filesystem, or an HTTP server where the server stores the data and from which each worker grabs it.
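One of the possible workarounds, if the data really has to travel over AMP, is to split it into pieces that each fit under the limit and send one piece per call. The sketch below shows only the splitting and reassembly, with the standard library alone. It assumes the limit applies to each value (AMP encodes value lengths in 16 bits, so 65535 bytes at most), and split_chunks, join_chunks and the default chunk size are hypothetical choices, not part of ampoule or Twisted.

```python
MAX_AMP_VALUE = 65535  # largest value length a 16-bit length prefix allows


def split_chunks(data, size=60000):
    """Split bytes into pieces of at most `size` bytes each."""
    if not 0 < size <= MAX_AMP_VALUE:
        raise ValueError("size must be between 1 and %d" % MAX_AMP_VALUE)
    return [data[i:i + size] for i in range(0, len(data), size)]


def join_chunks(chunks):
    """Reassemble the pieces, in order, on the receiving side."""
    return b"".join(chunks)
```

Each piece would then go out in its own call, and the receiving side has to keep the pieces in order before joining them. This adds bookkeeping, which is part of why storing the data somewhere the workers can fetch it is usually the simpler route.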
> I don't know whether my understanding is correct or not, please correct me.
It seems you got it right.

HTH

--
Valentino Volonghi aka Dialtone
Now running MacOS X 10.5
Home Page: http://www.twisted.it
http://www.adroll.com
participants (2)
- Chris
- Valentino Volonghi