Re: [Twisted-Python] Multicast XMLRPC

Aug. 26, 2006

      Chaz. wrote:
...
Now let me address the issue of TCP. It is a pretty heavy protocol to 
use. It takes a lot of resources on the sender and target and can take 
some time to establish a connection. Opening a 1000 or more sockets 
consumes a lot of resources in the underlying OS and in the Twisted client!
People keep trying to help you, and you keep repeating yourself. From 
what I can gather:

You *need* a relatively lightweight group communication method. My 
advice would be to investigate a message bus - see recent posts on this 
mailing list. "Spread" at www.spread.org and ActiveMQ (via the simple 
text-over-tcp-based STOMP protocol). Reports are that both can (under 
the right conditions) execute many thousands of group messages per second.

Failing that, Glyph has hinted at another approach. You could elect a 
small number (~1%) of your nodes as "proxies" so that as well as being 
clients, they act as intermediaries for messages. This is a simple form 
of overlay network, which you also stated you didn't want to use - lord 
knows why. People use these techniques for a reason - they work.

You *want* (have decided you want) a reliable multicast protocol over 
which you'll layer a simple RPC protocol. RMT (reliable multicast 
transport) is as yet an unsolved problem. It is VERY VERY hard. None 
exist for Twisted, to the best of my knowledge. I would be willing to 
bet money that, for "thousands" of nodes, the overhead of implementing 
such a protocol (in Python, one presumes) would exceed the overhead of 
just using TCP. If you had said "hundreds of thousands" of nodes, well, 
that would be different.

If you want to knock an RMT up based on the assumption you won't drop 
packets, then be my guest, but I would suggest that if you *really* 
believe multicast is that reliable, then your experience of IP multicast 
networks has been a lot more rosy than mine, and I run a very large one.

"reliable multicast" into google would be a good start - there are some 
good RFCs produced the the rmt IETF working group.
...
If I use TCP and stick to the serial, synchronized semantics of RPC, 
doing one call at a time, I have only a few ways to solve the problem. 
Do one call at a time, repeat N times, and that could take quite a 
while. I could do M spawnProcesses and have each do N/M RPC calls. Or I 
could use M threads and do it that way. Granted I have M sockets open at 
a time, it is possible for this to take quite a while to execute. 
Performance would be terrible (and yes I want an approach that has good 
to very good performance. After all who would want poor to terrible 
performance?)
Knuth and his comments on early optimisation apply here. Have you tried 
it? You might be surprised.

I have some Twisted code that does SNMP to over a thousand devices. This 
is, obviously, unicast UDP. The throughput is very high. A simple 
ACK-based sequence-numbered UDP unicast will very likely scale to 
thousands of nodes.
...
So I divided the problem down to two parts. One, can I reduce the amount 
of traffic on the invoking side of the RPC request? Second, is how to 
deal with the response. Obviously I have to deal with the issue of 
failure, since RPC semantics require EXACTLY-ONCE.
How many calls per second are you doing, and approximately what volume 
of data will each call exchange?

You seem inflexible about aspects of the design. If if were me, I'd 
abandon RPC semantics. Smarter people than anyone here have argued 
convincingly against making a remote procedure call look anything like a 
local one, and once you abandon *that*, RPCs look like message exchanges.
...
That gets me to the multicast or broadcast scheme. In one call I could 
get the N processors to start working. Now I just have to solve the 
other half of the problem: how to get the answers returned without 
swamping the network or how to detect when I didn't get an answer from a 
processor at all.
That leads me to the observation that on an uncongested ethernet I 
almost always have a successful transmission. This means I have to deal
Successful transmission is really the easy bit for multicast. There is 
IGMP snooping, IGMP querier misbehaviour, loss of forwarding on an 
upstream IGP flap, flooding issues due to global MSDP issues, and so forth.
...
with that issue and a few others. Why do I care? Because I believe I can 
accomplish what I need - get great performance most of the time, and 
only in a few instances have to deal with do the operation over again.
This is a tough problem to solve. I am not sure of the outcome but I am 
sure that I need to start somewhere. What I know is that it is partly 
transport and partly marshalling. The semantics of the call have to stay 
fixed: EXACTLY-ONCE.
If you MUST have EXACTLY-ONCE group communication semantics, you should 
use a message bus.

Re: [Twisted-Python] Multicast XMLRPC

Phil Mayers