
On 2008.10.31 16:52:26 +1100, Blair Bethwaite wrote:
I'm thinking about using Twisted to rewrite some communications in a Grid computing application. E.g.:
[JobServer]1 ---------- *[Proxy]1 ----------- *[Agents]
Agents get jobs, report status, results, etc through Proxy to JobServer. Agents are often distributed across a private network that has no external interface, hence the proxy, which is run on a machine between the private network and internet. Also, the proxy may do some caching/queuing of particular messages - especially where the number of agents behind it is large.
What we have at present is a TCP socket server; however, we're starting to hit scalability issues, and on top of that our future plans necessitate a notification framework. I've been digging around on twistedmatrix.com and in the book, but so far I'm having trouble finding any guidance or the level of detail that tells me whether Twisted has what we need, some bits of which are:
- we'd like to use a persistent stream/connection, at least between the JobServer and Proxy (traffic frequency will be reasonably high)
- it needs to be interoperable with Java (is there PB for Java?)
There is a Java version of PB: http://itamarst.org/software/twistedjava/
I've never used the Java version, so I can't say whether it works well. The Python version is excellent.
- each end of the connection should be able to register for and get notifications from the other (e.g. Agent gives a heartbeat, JobServer tells Agent to stop)
- sometimes the Proxy might be behind a firewall and only able to connect out, we need to be able to use that connection to go back as above
PB allows both ends of the connection to send and receive at any time, over a single connection. I use it for a game where multiple clients connect via TCP to a server, and then the clients send messages to the server whenever they want, and the server sends messages to one or more clients whenever it wants (over the original connections initiated by the clients, so the clients don't need to open any holes in their firewalls), and everything just stinking works.
- we want to dynamically configure streaming connections between Proxies
Yes, you can add more connections at any time. You have to write your own simple routing code.
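To give an idea of how little routing code that can be, here is a rough sketch in plain Python. It is not Twisted or PB code: `ProxyRouter`, `add_route`, `remove_route` and `forward` are names I made up for illustration, and each "link" is just a callable standing in for whatever connection object you would really hold.

```python
class ProxyRouter:
    """Hypothetical routing table: maps an agent id to the link that reaches it."""

    def __init__(self):
        # agent id -> callable that delivers a message down that link
        self.routes = {}

    def add_route(self, agent_id, deliver):
        """Register (or replace) the link used to reach an agent."""
        self.routes[agent_id] = deliver

    def remove_route(self, agent_id):
        """Forget a link, e.g. when its connection is lost."""
        self.routes.pop(agent_id, None)

    def forward(self, agent_id, message):
        """Deliver a message if a route exists; report whether it was sent."""
        deliver = self.routes.get(agent_id)
        if deliver is None:
            return False
        deliver(message)
        return True


# Two proxies chained together: the outer one forwards through the inner one.
inbox = []
inner = ProxyRouter()
inner.add_route("agent-1", inbox.append)
outer = ProxyRouter()
outer.add_route("agent-1", lambda msg: inner.forward("agent-1", msg))

sent = outer.forward("agent-1", "stop")
unknown = outer.forward("agent-9", "stop")
```

Routes can be added and removed while the program runs, which is the "dynamically configure" part. What to do with a message that has no route (drop, queue, report back) is a policy decision this sketch leaves open.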
Hopefully that's enough context for you to point at modules and tell whether these are appropriate questions:
Can I re-use a TCP stream for multiple XML-RPC or PB operations?
With PB, yes. XML-RPC is not typically used that way. You could certainly build a protocol that carries XML-RPC payloads over a persistent connection, but you'd lose the ability to use arbitrary XML-RPC libraries unmodified, which is probably the main benefit of choosing XML-RPC. Also, XML-RPC is an inherently rigid request/response protocol and thus fails some of your other requirements. If the client and server are both asynchronously initiating requests over the same connection, you need additional information to distinguish a new request from a response to the other side's last request. And if you send multiple requests over the same connection without waiting for a response to the first, you need to send more information correlating requests with responses. All doable, but not XML-RPC anymore.
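To make that "additional information" concrete, here is a minimal sketch in plain Python. It is neither XML-RPC nor PB: the JSON message layout, the `kind` and `id` fields, and the `Peer` and `connect` names are all assumptions of mine. The `kind` field distinguishes a request from a response, and the `id` field ties a response back to the call that is waiting for it.

```python
import itertools
import json


class Peer:
    """Hypothetical connection end that both issues requests and answers them."""

    def __init__(self, handlers):
        self.handlers = handlers        # method name -> callable
        self.pending = {}               # request id -> callback awaiting the result
        self._ids = itertools.count(1)
        self.send = None                # set by connect(); delivers one message

    def call(self, method, args, on_result):
        """Send a request and remember which callback is waiting for its answer."""
        request_id = next(self._ids)
        self.pending[request_id] = on_result
        self.send(json.dumps({"kind": "request", "id": request_id,
                              "method": method, "args": args}))

    def receive(self, line):
        """Handle one incoming message, which may be a request or a response."""
        msg = json.loads(line)
        if msg["kind"] == "request":
            result = self.handlers[msg["method"]](*msg["args"])
            self.send(json.dumps({"kind": "response", "id": msg["id"],
                                  "result": result}))
        else:
            # The id says which outstanding call this message answers.
            self.pending.pop(msg["id"])(msg["result"])


def connect(a, b):
    """Wire two peers together in memory; a real system would use a TCP stream."""
    a.send = b.receive
    b.send = a.receive


results = []
server = Peer({"add": lambda x, y: x + y})
agent = Peer({"stop": lambda job: "stopped %s" % job})
connect(server, agent)

agent.call("add", [2, 3], results.append)          # agent -> server request
server.call("stop", ["job-7"], results.append)     # server -> agent request
```

Both ends initiate requests over the same link here. PB already handles this bookkeeping for you, which is a large part of why I'd pick it over a home-grown scheme.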
If so, would it make sense to have the client (e.g. Proxy or Agent) initiate the connection and then make an RPC (e.g. "notify me of anything relevant") to the server which would eventually return when something relevant came along - triggering a callback in the client... wash, rinse, repeat?
Yes, you make a remote call which returns a deferred. You attach a callback and an errback to the deferred. When the remote call finishes, either your callback or your errback fires, with your return value or exception information.
I recommend writing something like a small chat system first, using the exact protocol you're considering, before tackling your real problem. If you can get Java clients and Python clients chatting through a chat proxy that can forward through other chat proxies, then you know you can make it work. When you write a little chatbot and run lots of instances in parallel and nothing chokes, then you know you can make it scale.
If not for the Java requirement, I would say that Twisted is a good fit, and that you could use either PB or AMP, depending on whether you want to pass around complex types or simple ones. But if you need Java, I don't know.
-- 
David Ripton dripton@ripton.net
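P.S. Here is a rough sketch of the "notify me, then ask again" loop with a callback and an errback. To keep it runnable without Twisted installed it uses the standard library's `concurrent.futures.Future` as a stand-in for a Deferred, so it only shows the shape of the idea and not the Twisted API. `NotificationSource`, `wait_for_event`, `publish`, `fail` and `subscribe` are names I invented for the example.

```python
from concurrent.futures import Future


class NotificationSource:
    """Hypothetical server side: hands out one future per 'notify me' request."""

    def __init__(self):
        self.waiting = []

    def wait_for_event(self):
        """Return a future that completes when the next event is published."""
        future = Future()
        self.waiting.append(future)
        return future

    def publish(self, event):
        """Complete every outstanding request with this event."""
        waiting, self.waiting = self.waiting, []
        for future in waiting:
            future.set_result(event)

    def fail(self, error):
        """Fail every outstanding request, e.g. because the link dropped."""
        waiting, self.waiting = self.waiting, []
        for future in waiting:
            future.set_exception(error)


def subscribe(source, on_event, on_error):
    """Ask for one notification, then ask again each time one arrives."""
    def done(future):
        error = future.exception()
        if error is not None:
            on_error(error)                      # errback path: stop asking
            return
        on_event(future.result())                # callback path
        subscribe(source, on_event, on_error)    # wash, rinse, repeat

    source.wait_for_event().add_done_callback(done)


events, errors = [], []
source = NotificationSource()
subscribe(source, events.append, errors.append)

source.publish("stop job 7")
source.publish("heartbeat ack")
source.fail(ConnectionError("link lost"))
```

With a real Deferred you would attach the two functions with `addCallbacks(callback, errback)` rather than checking for an exception yourself, but the flow is the same: exactly one of the two fires per call.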