<br><br><div class="gmail_quote">On Wed, Jul 21, 2010 at 1:58 PM, MinRK <span dir="ltr"><<a href="mailto:benjaminrk@gmail.com">benjaminrk@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<br><br><div class="gmail_quote"><div><div></div><div class="h5">On Wed, Jul 21, 2010 at 12:17, Brian Granger <span dir="ltr"><<a href="mailto:ellisonbg@gmail.com" target="_blank">ellisonbg@gmail.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div>On Wed, Jul 21, 2010 at 10:51 AM, MinRK <<a href="mailto:benjaminrk@gmail.com" target="_blank">benjaminrk@gmail.com</a>> wrote:<br>

><br>

><br>

> On Wed, Jul 21, 2010 at 10:07, Brian Granger <<a href="mailto:ellisonbg@gmail.com" target="_blank">ellisonbg@gmail.com</a>> wrote:<br>

>><br>

>> On Wed, Jul 21, 2010 at 2:35 AM, MinRK <<a href="mailto:benjaminrk@gmail.com" target="_blank">benjaminrk@gmail.com</a>> wrote:<br>

>> > I now have my MonitoredQueue object on git, which is the three socket<br>

>> > Queue<br>

>> > device that can be the core of the lightweight ME and Task models<br>

>> > (depending<br>

>> > on whether it is XREP on both sides for ME, or XREP/XREQ for load<br>

>> > balanced<br>

>> > tasks).<br>

>><br>

>> This sounds very cool.  What repos is this in?<br>

><br>

> all on my pyzmq master: <a href="http://github.com/minrk/pyzmq" target="_blank">github.com/minrk/pyzmq</a><br>

> The Devices are specified in the growing _zmq.pyx. Should I move them?  I<br>

> don't have enough Cython experience (this is my first nontrivial Cython<br>

> work) to know how to correctly move it to a new file still with all the<br>

> right zmq imports.<br>

<br>

</div>Yes, I think we do want to move them.  We should look at how mpi4py<br>

splits things up.  My guess is that we want to have the declaration of<br>

the 0MQ C API in a single file that other files can use.  Then have<br>

files for the individual things like Socket, Message, Poller, Device,<br>

etc.  That will make the code base much easier to work with.  But<br>

splitting things like this in Cython is a bit subtle.  I have done it<br>

before, but I will ask Lisandro Dalcin the best way to approach it.<br>

For now, I would keep going with the single file approach (unless you<br>

want to learn about how to split things using pxi and pxd files).<br></blockquote><div> </div></div></div><div>I'd be happy to help split it up if you find out the best way to go about it.</div><div><div></div><div class="h5">

<div> </div></div></div></div></blockquote><div><br></div><div>OK, I am a bit behind on things from being sick, but I may look into this when I review+merge your branch.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div class="gmail_quote"><div><div class="h5"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div><div></div><div><br>

>><br>

>> > The biggest difference in terms of design between Python in the<br>

>> > Controller<br>

>> > picking the destination and this new device is that the client code<br>

>> > actually<br>

>> > needs to know the XREQ identity of each engine, and all the switching<br>

>> > logic<br>

>> > lives in the client code (if not the user exposed code) instead of the<br>

>> > controller - if the client says 'do x in [1,2,3]' they actually issue 3<br>

>> > sends, unlike before, when they issued 1 and the controller issued 3.<br>

>> > This<br>

>> > will increase traffic between the client and the controller, but<br>

>> > dramatically reduce work done in the controller.<br>

>><br>

>> But because 0MQ has such low latency it might be a win.  Each request<br>

>> the controller gets will be smaller and easier to handle.  The idea of<br>

>> allowing clients to specify the names is something I have thought<br>

>> about before.  One question though:  what does 0MQ do when you try to<br>

>> send on an XREP socket to an identity that doesn't exist?  Will the<br>

>> client be able to know that the engine wasn't there?  That seems like<br>

>> an important failure case.<br>

><br>

> As far as I can tell, the XREP socket sends messages out to XREQ ids, and<br>

> trusts that such an XREQ exists. If no such id is connected, the message is<br>

> silently lost to the aether.  However, with the controller monitoring the<br>

> queue, it knows when you have sent a message to an engine that is not<br>

> _registered_, and can tell you about it. This should be sufficient, since<br>

> presumably all the connected XREQ sockets should be registered engines.<br>

<br>

</div></div>I guess I don't quite see how the monitoring is used yet, but it does<br>

worry me that the message is silently lost.  So you think 0MQ should<br>

raise on that?  I have a feeling that the identities were designed to be<br>

a private API thing in 0MQ and we are challenging that.<br></blockquote><div><br></div></div></div><div>I don't know what 0MQ should do, but I imagine the silent loss is based on thinking of XREP messages as always being replies. That way, a reply sent to a nonexistent key is interpreted as being a reply to a message whose requester is gone, and 0MQ presumes that nobody else would be interested in the result, and drops it. As far as 0MQ is concerned, you wouldn't want the following to happen:</div>


<div>A makes a request of B</div><div>A dies</div><div>B replies to A</div><div>B crashes because A didn't receive the reply</div><div><br></div><div>nothing went wrong in B, so it shouldn't crash.</div><div><br>


</div>

<div>For us, the XREP messages are not replies on the engine side (they are replies on the client side). We are using the identities to treat the engine-facing XREP as a keyed multiplexer. The result is that if you send a message to nobody, nobody receives it. It's not that nobody knows about it - the controller can tell, because it sees every message as it goes by, and knows what the valid keys are, but the send itself will not fail.  In the client code, you can easily check if a key is valid with the controller, so I don't see a problem here.</div>
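<div>A minimal in-memory sketch of the keyed-multiplexer behaviour described above, not the actual MonitoredQueue code. The class name MonitoredRouter and all of its methods are hypothetical, and no real 0MQ sockets are involved. It only illustrates the point that a send to an unknown key never fails, while a monitor that sees every message can still report it:</div>
<pre>
```python
class MonitoredRouter:
    """Toy stand-in for the engine-facing XREP socket plus its monitor."""

    def __init__(self):
        self.registered = set()    # identities the controller knows about
        self.inboxes = {}          # identity: list of delivered messages
        self.undeliverable = []    # (identity, message) pairs the monitor saw

    def register(self, identity):
        self.registered.add(identity)
        self.inboxes.setdefault(identity, [])

    def is_valid(self, identity):
        # What a client could ask the controller before sending.
        return identity in self.registered

    def send(self, identity, message):
        # The monitor sees every message as it goes by.
        if identity not in self.registered:
            self.undeliverable.append((identity, message))
        inbox = self.inboxes.get(identity)
        if inbox is not None:
            inbox.append(message)
        # Either way the send itself does not raise.


router = MonitoredRouter()
router.register("hello")
router.send("hello", "mr. b")   # delivered
router.send("nobody", "lost")   # dropped, but recorded by the monitor
```
</pre>
<div>The client never sees an error from send; it has to call something like is_valid, or ask the controller about undeliverable messages, to find out.</div>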


<div><br></div></div></blockquote><div> </div><div>OK</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="gmail_quote"><div></div><div>The only source of a problem I can think of comes from the fact that the client has a copy of the registration table, and presumably doesn't want to update it every time.  That way, an engine could go away between the client's updates of the registration, and some requests would vanish.  Note that the controller still does receive them, and the client can check with the controller on the status of requests that are taking too long.  The controller can use a PUB socket to notify of engines coming/going, which would mean the window for the client to not be up to date would be very small, and it wouldn't even be a big problem if it happened, since the client would be notified that its request won't be received.</div>
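<div>A rough sketch of the client side of that idea, with the PUB/SUB transport left out entirely. ClientRegistry, its method names, and the "register"/"unregister" event strings are all made up for illustration. The sketch assumes that any request still pending for an engine that unregisters should be flagged as failed:</div>
<pre>
```python
class ClientRegistry:
    """Client-side copy of the registration table, kept fresh by notifications."""

    def __init__(self, initial):
        self.engines = dict(initial)   # engine id: XREQ identity
        self.pending = {}              # message id: identity it was sent to
        self.failed = []               # message ids whose engine went away

    def submit(self, msg_id, engine_id):
        # Raises KeyError if the local table has no such engine.
        identity = self.engines[engine_id]
        self.pending[msg_id] = identity
        return identity

    def on_notification(self, event, engine_id, identity):
        # Would be called for each message from the controller's PUB socket.
        if event == "register":
            self.engines[engine_id] = identity
        elif event == "unregister":
            self.engines.pop(engine_id, None)
            # Flag every outstanding request addressed to the departed engine.
            for msg_id, ident in list(self.pending.items()):
                if ident == identity:
                    self.failed.append(msg_id)
                    del self.pending[msg_id]


client = ClientRegistry({0: "odin", 1: "franklin474"})
client.submit("msg-1", 0)
client.submit("msg-2", 1)
client.on_notification("unregister", 0, "odin")
```
</pre>
<div>After the unregister event, "msg-1" is in client.failed and engine 0 is gone from the local table, so the stale window is only as long as the notification takes to arrive.</div>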

<div class="im">


<div></div></div></div></blockquote><div><br></div><div>I think this approach makes sense.  At some level the same issue exists today for us in the Twisted version.  If you do mec.get_ids(), that information could become stale at any moment in time.  I think this is an intrinsic limitation of the multiengine approach (MPI included).</div>

<div><br></div><div>Cheers,</div><div><br></div><div>Brian</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="gmail_quote"><div class="im"><div>

 </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div><br>

> To test:<br>

> a = ctx.socket(zmq.XREP)<br>

> a.bind('tcp://<a href="http://127.0.0.1:1234" target="_blank">127.0.0.1:1234</a>')<br>

> b = ctx.socket(zmq.XREQ)<br>

> b.setsockopt(zmq.IDENTITY, 'hello')<br>

> a.send_multipart(['hello', 'mr. b'])<br>

> time.sleep(.2)<br>

> b.connect('tcp://<a href="http://127.0.0.1:1234" target="_blank">127.0.0.1:1234</a>')<br>

> a.send_multipart(['hello', 'again'])<br>

> b.recv()<br>

> # 'again'<br>

><br>

>><br>

>> > Since the engines' XREP IDs are known at the client level, and these are<br>

>> > roughly any string, it brings up the question: should we have strictly<br>

>> > integer ID engines, or should we allow engines to have names, like<br>

>> > 'franklin1', corresponding directly to their XREP identity?<br>

>><br>

>> The idea of having names is pretty cool.  Maybe default to numbers,<br>

>> but allow named prefixes as well as raw names?<br>

><br>

><br>

> This part is purely up to our user-facing side of the client code. It<br>

> certainly doesn't affect how anything works inside. It's just a question of<br>

> what a valid `targets' argument (or key for a dictionary interface) would be<br>

> in the multiengine.<br>

<br>

</div>Any string or list of strings?<br></blockquote><div> </div></div><div>Well, for now targets is any int or list of ints. I don't see any reason that you couldn't use a string anywhere an int would be used. It's perfectly unambiguous, since the two key sets are of a different type.</div>


<div> </div><div>you could do:</div><div>execute('a=5', targets=[0,1,'odin', 'franklin474'])</div><div>and the _build_targets method does:</div><div><br></div><div>target_idents = []</div><div>for t in targets:</div>


<div>    if isinstance(t, int):</div><div>        ident = identities[t]</div><div>    elif isinstance(t, str) and t in identities.itervalues():</div><div>        ident = t</div><div>    else:</div><div>        raise KeyError("bad target: %s"%t)</div>


<div>    target_idents.append(ident)</div><div>return target_idents</div><div class="im"><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div><br>

>><br>

>> > I think people might like using names, but I imagine it could get<br>

>> > confusing.<br>

>> >  It would be unambiguous in code, since we use integer IDs and XREP<br>

>> > identities must be strings, so if someone keys on a string it must be<br>

>> > the<br>

>> > XREP id, and if they key on a number it must be by engine ID.<br>

>><br>

>> Right.  I will have a look at the code.<br>

>><br>

>> Cheers,<br>

>><br>

>> Brian<br>

>><br>

>> > -MinRK<br>

>> ><br>

>> ><br>

>><br>

>><br>

>><br>

>> --<br>

>> Brian E. Granger, Ph.D.<br>

>> Assistant Professor of Physics<br>

>> Cal Poly State University, San Luis Obispo<br>

>> <a href="mailto:bgranger@calpoly.edu" target="_blank">bgranger@calpoly.edu</a><br>

>> <a href="mailto:ellisonbg@gmail.com" target="_blank">ellisonbg@gmail.com</a><br>

><br>

><br>

<br>

<br>

<br>

</div>--<br>

<div><div></div><div>Brian E. Granger, Ph.D.<br>

Assistant Professor of Physics<br>

Cal Poly State University, San Luis Obispo<br>

<a href="mailto:bgranger@calpoly.edu" target="_blank">bgranger@calpoly.edu</a><br>

<a href="mailto:ellisonbg@gmail.com" target="_blank">ellisonbg@gmail.com</a><br>

</div></div></blockquote></div></div><br>

</blockquote></div><br><br clear="all"><br>-- <br>Brian E. Granger, Ph.D.<br>Assistant Professor of Physics<br>Cal Poly State University, San Luis Obispo<br><a href="mailto:bgranger@calpoly.edu">bgranger@calpoly.edu</a><br>

<a href="mailto:ellisonbg@gmail.com">ellisonbg@gmail.com</a><br>