Re: [Twisted-Python] Re: Handling PBConnectionLost errors

30 Jul 2007

David,

You have gone above and beyond my expectations to answer my questions.
Thank you.

On Jul 28, 2007, at 1:07 AM, David Bolen wrote:
...
Daniel Miller  writes:
...
Is this such a stupid question that it doesn't even warrant a  
response?
~ Daniel
I agree with the other comment to the effect that the lack of response
may be due more to the underlying complexity of the question than to
lack of interest. ...
It's funny, my question was complex, but it nevertheless contained  
too many assumptions about my application and environment to allow  
you to answer easily. Thanks for taking a stab at it anyway.
...
For example, your opening point about:
...
...
(...) It would be nice to implement a fail-safe(er) way of calling
remote methods that would retry when necessary until the remote method
has been called successfully and the result has been returned.  (...)
has an implicit assumption that the remote method will even continue
to exist once the disconnect has occurred - something that is by no
means guaranteed with PB.
I hadn't even thought of that, although now that you point it out  
it's obvious. My (server-side) application is just a singleton facade  
to an accounting system database. I'm posting orders from an order  
entry system to invoices in the accounting system. The
server-supplied "referenceable" will always be available assuming something
terrible has not happened to the server (e.g. crashed, hacked or  
physically damaged--none of which are things I'm trying to solve here).
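
One thing any retry scheme has to handle in a setup like this: if the connection drops after the server has posted the invoice but before the result reaches the client, a blind retry would post the same order twice. A minimal sketch of one way to guard against that, assuming each order carries a unique id. `AccountingFacade`, `post_order` and the in-memory dictionary are hypothetical names for illustration, not PB APIs; in a PB server the method would presumably be exposed as a `remote_` method on the root object, and the record of posted orders would live in the database.

```python
import itertools


class AccountingFacade:
    """Hypothetical stand-in for the server-side singleton facade."""

    def __init__(self):
        self._invoice_numbers = itertools.count(1000)
        self._posted = {}  # order id -> invoice number already issued

    def post_order(self, order_id, amount):
        # A retried call for an order that was already posted returns the
        # original invoice number instead of creating a second invoice.
        if order_id in self._posted:
            return self._posted[order_id]
        invoice_no = next(self._invoice_numbers)
        # ... a real system would write the invoice to the database here ...
        self._posted[order_id] = invoice_no
        return invoice_no


facade = AccountingFacade()
first = facade.post_order("order-17", 125.00)
retry = facade.post_order("order-17", 125.00)  # e.g. repeated after a lost connection
```

With this in place the client can repeat a call whose outcome it never learned, and both attempts yield the same invoice number.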
...
Perhaps some earlier messages of mine when we had just finished
putting together the remote wrapping and reconnect support in our
system.  See my responses to the thread at:
http://twistedmatrix.com/pipermail/twisted-python/2005-July/011030.html
and
http://twistedmatrix.com/pipermail/twisted-python/2005-July/011046.html
Thanks, I'll take a look at them.
...
It hits on topics beyond that of just a reliable method call, but the
second message more specifically talks about the wrapper that
implements reconnections, and how we dealt with updating references
post-reconnect.  You can probably see how the design dovetailed with
our particular server side structure (the registry was persistent as
were the managers, so they provided the concrete point of
reattachment).  And the use of the wrappers around references meant we
could "correct" the wrappers for a new connection without having to
worry about what parts of the client application may have been holding
references.  Perhaps it will give you some other ideas in your own
system.
This sounds good. I think I have a similar enough setup that I will
be able to at least gain some good ideas.
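
A rough sketch of the wrapper idea as described above, under stated assumptions: client code holds a wrapper rather than the raw remote reference, so reconnect logic can swap in a fresh reference without tracking down every holder. `ReferenceWrapper`, `retarget` and `FakeReference` are hypothetical names, not part of PB, and this is not the design from the linked messages. A real `callRemote` returns a Deferred; the fake below returns a plain tuple only so the example runs without a server.

```python
class ReferenceWrapper:
    """Client-side handle that hides which remote reference is current."""

    def __init__(self, reference=None):
        self._reference = reference

    def retarget(self, reference):
        # Called by reconnect logic once a fresh reference has been obtained.
        self._reference = reference

    def callRemote(self, name, *args, **kwargs):
        if self._reference is None:
            raise RuntimeError("not connected")
        # Delegate to whichever reference is current at call time.
        return self._reference.callRemote(name, *args, **kwargs)


class FakeReference:
    """Hypothetical stand-in for a remote reference, for illustration only."""

    def __init__(self, label):
        self.label = label

    def callRemote(self, name, *args, **kwargs):
        return (self.label, name, args)


wrapper = ReferenceWrapper(FakeReference("connection-1"))
held_elsewhere = wrapper  # another part of the client keeps the wrapper
wrapper.retarget(FakeReference("connection-2"))  # after a reconnect
result = held_elsewhere.callRemote("post_order", 17)
```

Because every holder shares the same wrapper object, the call made through `held_elsewhere` goes to the reference from the new connection.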
...
For your other points:
...
...
I have two questions:
1. Does something like this already exist?
<snip>
... I'm not aware of any existing approach that is generally suitable
for any application.  I rather doubt any single generic approach would
be possible, since PB provides for many mechanisms of state
management and referenceability among servers and clients.
You're probably right, although the problem domain is interesting  
enough to me that I may try to see what I can do if I ever get enough  
time :)
...
...
...
2. Is this a totally stupid idea? (would it be better to improve
our physical network than to try to band-aid the problem with
something like this?)
It's never a stupid idea to engineer for network interruptions, but
like everything else a design must weigh benefits against
cost/development.  With that said, it might not be a bad idea to also
look into your network.  TCP connections are rather hard to break just
due to network transmission problems, and all your PB calls are going
across a single TCP session.  They might be significantly delayed on a
bad network, but the connection itself shouldn't fail unless something
more extreme (and unusual) is happening.  Given the level of problems
you're encountering, I wouldn't be surprised if something else was
awry.
That's what I thought (the connections shouldn't just be dropping for  
no apparent reason, especially since they are all within the bounds  
of a LAN). I know this is getting off topic, but I thought maybe  
you'd know: collisions on the hub should be handled by TCP, and my  
application should not have to worry about them, correct? Even that  
doesn't answer why there are dropped connections on the switched side  
of the network. Maybe we have some bad wiring? FWIW, I am planning to
replace the hub with another switch (there are other problems
as well).

Again, thanks very much for your well-thought-out response.

~ Daniel