[Twisted-Python] Twisted-friendly Message bus
![](https://secure.gravatar.com/avatar/426d6dbf6554a9b3fca1fd04e6b75f38.jpg?s=120&d=mm&r=g)
All, I've run into "this service needs to know about this event from this other service" once too often. It's message bus time, baby. However, many of the daemons I use are locally-written Twisted code, so a Twisted-friendly (async API) one would be nice. Note also that other non-Twisted code will need to connect to it, so rolling my own (aside from it being a bad idea from a NIH perspective) is out. I could of course wrap the message queue receiver for something well-proven like Spread in a thread. Then there's XMPP, which is the darling of the anti-Java movement, though wrapping Twisted's XMPP API to get a message bus might be annoying. Suggestions?
![](https://secure.gravatar.com/avatar/96a6fa70caad11789ce45b7096860447.jpg?s=120&d=mm&r=g)
I do it using "named singleton" classes and callbacks. Anyone that wants to share the queue merely asks for the class to be instantiated. There are 3 methods - registerCallback, removeCallback and sendEvent. Whatever arguments, short of a few, are passed to the callback routines. It is simple and has worked for me. I have thought about doing something more sophisticated so the message bus can extend to other machines, but I haven't done that yet. Chaz Phil Mayers wrote:
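For readers who want to picture the "named singleton" approach, here is a minimal in-process sketch. Only the three method names (registerCallback, removeCallback, sendEvent) come from the message above; the class name `EventBus`, the per-event keying, and the way instances are shared by name are assumptions made for illustration, not Chaz's actual code.

```python
from collections import defaultdict


class EventBus:
    """Process-local bus; one shared instance per name (hypothetical design)."""

    _instances = {}  # name -> EventBus

    def __new__(cls, name):
        # Asking for a name that already exists returns the same object.
        bus = cls._instances.get(name)
        if bus is None:
            bus = super().__new__(cls)
            bus._callbacks = defaultdict(list)
            cls._instances[name] = bus
        return bus

    def registerCallback(self, event, callback):
        self._callbacks[event].append(callback)

    def removeCallback(self, event, callback):
        # Raises ValueError if the callback was never registered.
        self._callbacks[event].remove(callback)

    def sendEvent(self, event, *args, **kwargs):
        # Iterate over a copy so callbacks may unregister themselves.
        for callback in list(self._callbacks[event]):
            callback(*args, **kwargs)
```

Two services in the same process that both call `EventBus("daemon-events")` end up sharing one set of callbacks. Delivery here is synchronous and limited to a single process, which matches the limitation mentioned in the message.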
![](https://secure.gravatar.com/avatar/2c498e6b589e4a4318a8280da536fb36.jpg?s=120&d=mm&r=g)
Phil Mayers <p.mayers@imperial.ac.uk> writes:
You could go with a signal dispatcher mechanism. PyDispatcher works well, although the actual signal delivery is synchronous. But it has nice features such as weak references to receivers so no need to worry about cleanup when a receiver disappears. There's also a package, Louie, based on PyDispatcher which in theory added support for Twisted among other things. We're still using PyDispatcher (I don't recall exactly why I didn't want to move to Louie - maybe we were still on Python 2.2 at the time) ourselves. http://pydispatcher.sourceforge.net (also has a ref to Louie) -- David
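The weak-reference behaviour described above can be illustrated with the standard library alone. This is a sketch of the general idea, not PyDispatcher's implementation or API; the class name `WeakDispatcher` and its `connect`/`send` methods are hypothetical names chosen for the example.

```python
import inspect
import weakref


class WeakDispatcher:
    """Synchronous signal dispatcher that holds receivers only weakly."""

    def __init__(self):
        self._receivers = {}  # signal -> list of weak references

    def connect(self, receiver, signal):
        # Bound methods need WeakMethod; a plain weakref.ref to a bound
        # method would be dead as soon as connect() returns.
        if inspect.ismethod(receiver):
            ref = weakref.WeakMethod(receiver)
        else:
            ref = weakref.ref(receiver)
        self._receivers.setdefault(signal, []).append(ref)

    def send(self, signal, *args, **kwargs):
        refs = self._receivers.get(signal, [])
        results = []
        for ref in list(refs):
            receiver = ref()
            if receiver is None:
                refs.remove(ref)  # receiver was garbage-collected; drop it
                continue
            results.append(receiver(*args, **kwargs))
        return results
```

Because the dispatcher never keeps a receiver alive, an object that goes away simply stops receiving signals, with no explicit disconnect step. As with the delivery noted above, `send` calls each receiver synchronously.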
![](https://secure.gravatar.com/avatar/f9b3b8f946adb12579b22425a2f29d7d.jpg?s=120&d=mm&r=g)
I like and use XMPP myself. Since it's a standard, there are multiple implementation choices, which is nice. And you get queueing for free. And besides having a bunch of client libraries available in all major programming languages, for testing you can just manually send messages using an IM client like GAIM. The only other major standard I know of is JMS, and that's tied to Java. Cheers, Christian On 6/16/06, Phil Mayers <p.mayers@imperial.ac.uk> wrote:
![](https://secure.gravatar.com/avatar/7ed9784cbb1ba1ef75454034b3a8e6a1.jpg?s=120&d=mm&r=g)
On Fri, 16 Jun 2006 17:25:22 +0100, Phil Mayers <p.mayers@imperial.ac.uk> wrote:
Could you clarify whether you mean message passing within a single process or message passing between different processes on the same host or message passing between multiple hosts? Different responders in this thread seem to have assumed different things. Jean-Paul
![](https://secure.gravatar.com/avatar/7433ccc4d72b41e859d7c3740b8cb178.jpg?s=120&d=mm&r=g)
----- Original Message ----- From: "Chaz." <eprparadocs@gmail.com> To: "Phil Mayers" <p.mayers@imperial.ac.uk> Cc: "Twisted general discussion" <twisted-python@twistedmatrix.com> Sent: Monday, June 19, 2006 10:26 PM Subject: Re: [Twisted-Python] Twisted-friendly Message bus
can you elaborate on this please? ideally, i'd like to see some reasonably hard data for this. not implying that you're wrong, mind you, but i was under a completely different impression wrt resource utilization (albeit without extensive firsthand experience). -p
![](https://secure.gravatar.com/avatar/96a6fa70caad11789ce45b7096860447.jpg?s=120&d=mm&r=g)
If you go to the Spread website and read the documentation and source code, they mention both the limit on the number of computers and the performance hit.

First, it won't support more than 200 computers or so. It appears that deep in the bowels of the code is a byte-sized field! The code is pretty convoluted, so I never did manage to find out specifically which field they meant. If you are going to build a system that involves a hundred machines or fewer you are probably all right.

Second, as for performance, a search will point out how costly reliable group communication is in general. I would suggest you load the software up and run the demo, tracking the message passing with Ethereal. Once this begins to scale upwards to hundreds of machines you will see the problem.

Third, the configuration of a Spread cluster is fixed, based on a static config file that lists the machines participating in the communication. This was a problem for me; I wanted my system to scale upward by just adding a machine to the cluster, without having to change a config file and reboot the daemon machines! It doesn't make for a very reliable approach (after all, if IM systems could do it, why couldn't non-IM systems?).

It wasn't until I hit the boundaries of the Spread system that I began to think about all that was involved in designing a highly reliable, highly scalable system. Only in sketching out the design of a reliable group communication system similar to Spread did I begin to see the weaknesses in the solution.

Now if you think Spread is the way to go, and it might be for small clusters of machines, you should do a little research into things like the SWIM protocol (of which I am doing a Twisted implementation right now and will be releasing to the community) or other approaches to group-like communication (do a Google search for overlay networks, which I will also be releasing). You will find a lot of relevant information. In fact it will probably lead you to a group at Cornell (and a few people there) and some people at the U of I at Champaign.

So, as I said, Spread is fine for small clusters of machines with a fixed deployment. It has tolerable performance and overhead. If you find that you want to scale upwards of 200 machines or up to thousands, Spread won't cut it.

For those of you who like reading academic stuff that is grounded in reality, point your browsers to Google and find a copy of: "A Scalable Services Architecture" by Tudor Marian, Ken Birman and Robbert van Renesse, all in the Dept of CS at Cornell. (You might recognize the name Birman as the guy who started a lot of this stuff... his first system was called Isis.)

For what it is worth, Charles "Chaz" Wegrzyn

Paul G wrote:
![](https://secure.gravatar.com/avatar/7ed9784cbb1ba1ef75454034b3a8e6a1.jpg?s=120&d=mm&r=g)
On Wed, 28 Jun 2006 08:52:22 -0400, "Chaz." <eprparadocs@gmail.com> wrote:
Just to clarify in case anyone is confused, here is the Spread website: http://www.spread.org/ This is not related to the twisted.spread package. Jean-Paul
![](https://secure.gravatar.com/avatar/7433ccc4d72b41e859d7c3740b8cb178.jpg?s=120&d=mm&r=g)
----- Original Message ----- From: "Chaz." <eprparadocs@gmail.com> To: "Paul G" <paul-lists@perforge.com> Cc: "Twisted general discussion" <twisted-python@twistedmatrix.com>; "Phil Mayers" <p.mayers@imperial.ac.uk> Sent: Wednesday, June 28, 2006 8:52 AM Subject: Re: [Twisted-Python] Twisted-friendly Message bus
yes, for my purposes, a small number is perfectly fine.
there isn't an incredible amount of cost for baseline group communication. the incremental cost is attached to several incremental levels of delivery, ordering and consistency guarantees. assuming i wanted delivery guarantees, but didn't care about ordering or consistency, i imagine the cost wouldn't be much higher than plain udp. with spread specifically, i would imagine there's also extra cost associated with handling the WAN case and partitioning, which i wouldn't need. however, that's something i could live with if it weren't too great.
their new version (currently in beta) removes this limitation.
although i hadn't seen swim (doesn't help much, in any case), i've done quite a bit of digging and read quite a few papers. i've got several use cases i'm trying to find the right solutions for:

1. a lightweight message bus (a la mq) for a cluster of machines on a low-latency lan; must be fast, light, usable from python, php and c, not implemented in java, and preferably have an async api. this would be used to implement a cluster-wide event service.
2. same as above, but with much relaxed requirements, for use as a log message handling service (think sending apache logs through it to logging servers).
3. a lightweight group communications framework which would allow me to implement multiple finely tuned semantics for consistency and ordering guarantees. this would be used to implement distributed data structures. can assume low latency lan environments, doesn't need to care about partitions (this would eventually run over myrinet/infiniband/friends).

unfortunately, the preponderance of work done in this area centers around 'overlay networks', p2p and other such beasts; this introduces complexities which make the solutions too heavy for my use. it appears that the stuff you're interested in is along the same lines; that doesn't mean it (or you) suck(s), just that it's not what i'm looking for ;]

moreover, i would ideally like to use something very thin and hackable; spread is already too much of a monolithic black box for use case #3 - the dds stuff i want to do would allow me to select consistency guarantees from none, to modest, to very stringent for each specific case, eg i wouldn't want to pay for serialization if i were replacing, say, php's session storage stuff and intended to store transient data. the berkeley ninja/sedo stuff is the closest i've seen, but it's in java (not a huge problem, since i could port it) and the code appears to be unavailable (large, some would say insurmountable, problem). if anyone's got any ideas, i'm all ears.

it looks like i'll be implementing #1 on top of twisted (locally, not for twisted project proper) in the near future (unless the dev roadmap changes *again*), so i'm definitely eager to hear about alternatives to sticking this on top of pb (not the right tool for the job, imho) or spread. in this case, java solutions are out (i'm not an anti-java zealot, i won't discuss the reasoning here). -p
![](https://secure.gravatar.com/avatar/fe24473d748a78f2ae29cdbc11c293a4.jpg?s=120&d=mm&r=g)
Phil Mayers wrote:
I've run into "this service needs to know about this event from this other service" once too often. It's message bus time, baby.
I ran into similar problems at work where we had been using PB but found that it did not perform to expectation (or requirement). I've replaced the bulk of our PB communications with a STOMP[1] protocol implementation that talks to ActiveMQ[2] servers. This has been great and I can easily push over 10000 messages/second now. And by switching over to ActiveMQ, we got a lot of other interesting features (much like what Spread might provide, I guess), like topics (many-to-many communication rather than point-to-point), and their reliability and persistence features. I did have to write my own STOMP protocol implementation as the existing one for Python is not up to snuff. I may be able to release it back to the ActiveMQ people, but I'm not entirely sure yet. And sure .. it meant installing Java. But it also meant having something that worked well, performed beautifully, and gave me new features that are letting me re-architect our software to greatly simplify it and make it more reliable. - Bruce [1] http://www.activemq.org/site/stomp.html [2] http://www.activemq.org/site/home.html
![](https://secure.gravatar.com/avatar/7433ccc4d72b41e859d7c3740b8cb178.jpg?s=120&d=mm&r=g)
----- Original Message ----- From: "Bruce Mitchener" <bruce@cubik.org> To: "Twisted general discussion" <twisted-python@twistedmatrix.com> Sent: Wednesday, June 28, 2006 10:56 AM Subject: Re: [Twisted-Python] Twisted-friendly Message bus
what were the problems with the existing implementation? is your implementation integrated with twisted? more info (and even... gasp... source) would be very sweet, seeing as this looks like my best option for certain things at the moment. -p
![](https://secure.gravatar.com/avatar/fe24473d748a78f2ae29cdbc11c293a4.jpg?s=120&d=mm&r=g)
Paul G wrote:
I did two sorts of Twisted integration. First, I have it as a Protocol subclass. Secondly, I have it using deferreds for request/reply type stuff. The main problem that I recall with the existing code was that it assumed that it could just do a blocking read and get everything read in at once. This clearly isn't true, as a STOMP message may span multiple read()s from the network (and this happens quite regularly under load). I need to check with work about releasing it. I'll try to do that tomorrow, but things are pretty crazy, so it might not happen until next week after the holiday. But really, it is a pretty simple protocol. The main tricks are in handling partial messages correctly. The other way would be to look at the C++ OpenWire and STOMP implementations and think about wrapping one of them with SWIG. I didn't go that route yet, but may for performance/features in the future. - Bruce
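The partial-message problem described above can be sketched without any networking code. This is not Bruce's implementation (which was not posted in this thread); the class name `StompFrameBuffer` and its `feed` method are hypothetical. It assumes every frame ends with a NUL byte and that bodies contain no NUL bytes, so it ignores the `content-length` header that a complete client would have to honour.

```python
class StompFrameBuffer:
    """Accumulates raw bytes and returns only complete STOMP frames."""

    def __init__(self):
        self._buffer = b""

    def feed(self, data):
        """Add bytes from one read; return a list of (command, headers, body)."""
        self._buffer += data
        frames = []
        # A frame is complete only once its terminating NUL has arrived;
        # anything after the last NUL stays buffered for the next read.
        while b"\x00" in self._buffer:
            raw, self._buffer = self._buffer.split(b"\x00", 1)
            raw = raw.lstrip(b"\n")  # tolerate newlines between frames
            if raw:
                frames.append(self._parse(raw))
        return frames

    @staticmethod
    def _parse(raw):
        # Command line and headers come first, then a blank line, then the body.
        head, _, body = raw.partition(b"\n\n")
        lines = head.decode("utf-8").split("\n")
        headers = {}
        for line in lines[1:]:
            key, _, value = line.partition(":")
            headers.setdefault(key, value)  # first occurrence of a header wins
        return lines[0], headers, body
```

In a Twisted `Protocol` subclass, `dataReceived` would presumably pass each chunk to `feed` and dispatch whatever complete frames come back, which avoids the assumption that one read delivers one whole message.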
![](https://secure.gravatar.com/avatar/7433ccc4d72b41e859d7c3740b8cb178.jpg?s=120&d=mm&r=g)
----- Original Message ----- From: "Phil Mayers" <p.mayers@imperial.ac.uk> To: "Twisted general discussion" <twisted-python@twistedmatrix.com> Sent: Friday, June 30, 2006 4:08 AM Subject: Re: [Twisted-Python] Twisted-friendly Message bus
apologies, i'm handicapped with outlook express these two weeks and didn't check the quotations for sanity. bruce mitchener wrote that of course. -p
participants (7)
- Bruce Mitchener
- Chaz.
- christian simms
- David Bolen
- Jean-Paul Calderone
- Paul G
- Phil Mayers