
David,
You have gone above and beyond my expectations to answer my questions. Thank you.
On Jul 28, 2007, at 1:07 AM, David Bolen wrote:
Daniel Miller daniel@keystonewood.com writes:
Is this such a stupid question that it doesn't even warrant a response?
~ Daniel
I agree with the other comment to the effect that the lack of response may be due more to the underlying complexity of the question than to lack of interest. ...
It's funny, my question was complex, but it nevertheless contained too many assumptions about my application and environment to allow you to answer easily. Thanks for taking a stab at it anyway.
For example, your opening point about:
(...) It would be nice to implement a fail-safe(er) way of calling remote methods that would retry when necessary until the remote method has been called successfully and the result has been returned. (...)
has an implicit assumption that the remote method will even continue to exist once the disconnect has occurred - something that is by no means guaranteed with PB.
I hadn't even thought of that, although now that you point it out it's obvious. My (server-side) application is just a singleton facade to an accounting system database. I'm posting orders from an order entry system to invoices in the accounting system. The server-supplied "referenceable" will always be available assuming something terrible has not happened to the server (e.g. crashed, hacked or physically damaged--none of which are things I'm trying to solve here).
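To make the retry idea concrete, here is a minimal sketch of what I have in mind. It is plain, synchronous Python and not Twisted's API: `ConnectionLost`, `call_with_retry`, and `get_reference` are hypothetical names invented for illustration, and it assumes the remote method is safe to call more than once. Real PB code would instead chain on the Deferred returned by `callRemote`.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical stand-in for whatever error a dropped connection raises."""


def call_with_retry(get_reference, method, *args, retries=5, delay=0.0):
    """Call `method` on the remote reference, retrying after a disconnect.

    `get_reference` is a hypothetical callable returning a currently valid
    reference; it is re-invoked on every attempt so that a reference obtained
    after a reconnect is picked up.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for _ in range(retries):
        try:
            ref = get_reference()
            return getattr(ref, method)(*args)
        except ConnectionLost as exc:
            # Remember the failure and pause before the next attempt.
            last_error = exc
            time.sleep(delay)
    # Every attempt failed: surface the last disconnect to the caller.
    raise last_error
```

One caveat with this shape: if the connection drops after the server has processed the call but before the result arrives, a blind retry would post the same order twice, so the server side would need some way to recognize a repeated request.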
Perhaps some earlier messages of mine, from when we had just finished putting together the remote wrapping and reconnect support in our system, would be useful. See my responses to the thread at:
http://twistedmatrix.com/pipermail/twisted-python/2005-July/011030.html
and
http://twistedmatrix.com/pipermail/twisted-python/2005-July/011046.html
Thanks, I'll take a look at them.
It hits on topics beyond that of just a reliable method call, but the second message more specifically talks about the wrapper that implements reconnections, and how we dealt with updating references post-reconnect. You can probably see how the design dovetailed with our particular server side structure (the registry was persistent as were the managers, so they provided the concrete point of reattachment). And the use of the wrappers around references meant we could "correct" the wrappers for a new connection without having to worry about what parts of the client application may have been holding references. Perhaps it will give you some other ideas in your own system.
This sounds good. I think my setup is similar enough that I will at least be able to get some good ideas from it.
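As I understand the wrapper approach, the client holds on to a stable wrapper object while the real reference behind it is swapped out after a reconnect. A rough sketch of that reading, in plain Python rather than actual PB code (the `ReferenceWrapper` class and its `attach`, `detach`, and `call` methods are hypothetical names of mine, not taken from your messages or from Twisted):

```python
class ReferenceWrapper:
    """Stable handle kept by client code; the real reference can be swapped."""

    def __init__(self, reference=None):
        self._reference = reference

    def attach(self, reference):
        """Point the wrapper at a reference obtained over a new connection."""
        self._reference = reference

    def detach(self):
        """Forget the current reference, e.g. when the connection is lost."""
        self._reference = None

    def call(self, method, *args):
        """Forward a call to whichever reference is currently attached."""
        if self._reference is None:
            raise RuntimeError("not connected")
        return getattr(self._reference, method)(*args)
```

Because every part of the client holds the wrapper and never the raw reference, a single `attach` after reconnecting updates them all at once.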
For your other points:
I have two questions:
- Does something like this already exist?
<snip>
... I'm not aware of any existing approach that is generally suitable for any application. I rather doubt any single generic approach would be possible, since PB provides for many mechanisms of state management and referenceability among servers and clients.
You're probably right, although the problem domain is interesting enough to me that I may try to see what I can do if I ever get enough time :)
- Is this a totally stupid idea? (would it be better to improve our physical network than to try to band-aid the problem with something like this?)
It's never a stupid idea to engineer for network interruptions, but like everything else a design must weigh benefits against cost/development. With that said, it might not be a bad idea to also look into your network. TCP connections are rather hard to break just due to network transmission problems, and all your PB calls are going across a single TCP session. They might be significantly delayed on a bad network, but the connection itself shouldn't fail unless something more extreme (and unusual) is happening. Given the level of problems you're encountering, I wouldn't be surprised if something else was awry.
That's what I thought (the connections shouldn't just be dropping for no apparent reason, especially since they are all within the bounds of a LAN). I know this is getting off topic, but I thought maybe you'd know: collisions on the hub should be handled by TCP, and my application should not have to worry about them, correct? Even that doesn't explain why there are dropped connections on the switched side of the network. Maybe we have some bad wiring? FWIW, I am planning to replace the hub with another switch (there are other problems as well).
Again, thanks very much for your well-thought-out response.
~ Daniel