[Twisted-Python] clients of perspective brokers
I have a question about clients of perspective brokers: I want the processes of my application to be able to connect arbitrarily to other processes on a local network without assuming their existence or location in advance. In the Twisted examples and how-tos, however, the idiom seems to be defining the other clients and servers in the network a priori (usually before calling reactor.run, or as part of an application's configuration). My question is: are there any restrictions on calling reactor.connectTCP (or similar) arbitrarily during the execution of the program? Is there any cleanup to do when the other party is no longer required, etc.?

Also, it is unclear to me how reconnection works for perspective broker clients. If a connection is dropped by the server, and a client then tries to make a method call on a remote object, will the client factory try to reconnect before making the request, or will the request fail and reconnection be attempted the next time, etc.?

Thanks,

Joe.
On Sun, 6 Mar 2005 09:49:11 -0800 (PST), Joachim Boomberschloss wrote:
I have a question about clients of perspective brokers: I want the processes of my application to be able to connect arbitrarily to other processes in a local network without assuming their existence or location in advance.
You might find this thread useful: http://twistedmatrix.com/pipermail/twisted-python/2005-February/009579.html

Dave Cook
Joachim Boomberschloss writes:
Also, it is unclear to me how reconnection works for perspective broker clients. If a connection is dropped by the server, and a client then tries to make a method call on a remote object, will the client factory try to reconnect before making the request, or will the request fail and reconnection be attempted the next time, etc.?
There are no automatic reconnects by default. Additionally, once you have lost a connection, all existing references held by the client to objects on the server will be invalid from that point on.

There is a general-purpose ReconnectingClientFactory in the twisted.internet.protocol module, but it only handles making the basic socket reconnection, and not any higher-level re-establishment of protocol communication. There was also some work on a more persistent remote reference scheme (in the sturdy module).

The problem with handling reconnects is that PB object references are only good for a particular session (since they match up with broker object dictionaries that are part of the remote protocol instance and go away when the session drops). So even if you re-establish the raw PB connection, none of the object references previously held by the client will be valid any longer. Even the sturdy module only seemed to work for the root object and not other random references held by the application. Back when I was looking to solve the same issue, I didn't really find anything suitable in the Twisted code base itself.

So it's mostly up to your application to handle these sorts of scenarios. To be honest, though, since your application knows the most about how it is using references and objects, it can often have the simplest implementation.

For example, in our application we make use of a registry of components, and when distributing the application, the client starts with a remote registry (a Referenceable), and then retrieves remote component references (also Referenceables) for any component it interacts with. Pretty much everything else is a normal remote copy (a Copyable rather than a Referenceable). So the registry and components provided a great control point for handling network outages.
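As an aside, the retry timing provided by ReconnectingClientFactory is essentially exponential backoff. A rough sketch of the idea (simplified: the real factory uses a growth factor of about 1.618, a much larger maximum delay, and adds random jitter):

```python
# Rough sketch of the exponential-backoff retry timing behind
# twisted.internet.protocol.ReconnectingClientFactory. The initial
# delay, factor, and cap below are simplified for illustration.

def backoff_delays(initial=1.0, factor=2.0, max_delay=30.0):
    """Yield successive reconnect delays, growing up to a cap."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * factor, max_delay)

gen = backoff_delays()
first_six = [next(gen) for _ in range(6)]
# first_six == [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

Remember that this only schedules the socket reconnect; as described above, re-establishing PB references after the connection comes back is still up to the application.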
Also, the components whose references are long-lived in the client (and which we care about maintaining across an outage) are independent of the remote session - that is, they exist independently on the server. So recovery from loss of a network connection is simply re-accessing the prior remote component. That makes handling such outages in a transparent manner fairly straightforward, since we can use the original connection information to perform a reconnect without re-involving high-level application code.

We ended up with three main parts to the recovery system:

* A remote registry wrapper that works just like a local registry but automatically wraps references to remote components in a component wrapper.
* A remote component wrapper that handles wrapping a remote reference, both to control method access (so we can specially handle some methods locally) and to isolate the application from directly holding onto a PB reference for the remote component object.
* Our own PBClientFactory subclass that handles connectivity issues, and automatically wraps a reference to a remote registry (which is obtained through our Root object) in the remote registry wrapper.

In addition, we tie them together with various signals (currently using the pyDispatcher package). A client app starts with the client factory, which knows how to connect, reconnect after a failure (with a prescribed retry timing mechanism), periodically ping the remote root object for a live session, and emit signals when the connection goes up or down. The application asks the client factory for the remote registry, and gets back a remote registry wrapper. Since the wrapper operates as a local registry, the application code can work locally or remotely. If the client factory sees the connection drop, once it reconnects, it emits a connection signal which includes the new registry wrapper.
The client factory also gives us a good place to perform a series of steps we need to do with the remote root object in order to get access to the remote registry, allowing those operations to complete before giving the registry back to the application, either during the initial connection (through a waiting deferred) or on a reconnect (via the connection signal).

The remote component wrappers (which also include the remote registry wrapper) handle the low-level potential for failures. A wrapper handles failures during any PB request (both DeadReferenceError and PBConnectionLost) and, in addition to passing up the error, emits its own signal for the failed request. The client factory listens for such signals, which it uses to initiate an immediate ping test - which in turn can lead to notifying the entire system that the connection is down.

We did patch our Twisted so the DeadReferenceError was returned as a deferred rather than raised inline. But once everything centralized around the remote wrappers, that technically became unnecessary, because the wrappers are the only place (aside from the client factory) that issues callRemote calls, so it's not that hard to handle both the local exception and the deferred error.

In the other direction, the wrappers all listen for the client factory's connected signal, and upon receipt they use the supplied remote registry to re-query the component they wrap (information they saved when created) in order to get a new remote reference. Because all of the higher-level application code is holding a reference (Python-wise) to the wrapper object and not the PB reference, we can swap in a new reference inside the wrapper without anything in the application being the wiser or needing to change.
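A framework-agnostic sketch of that wrapper idea (all names here are hypothetical; a real implementation would wrap twisted.spread.pb remote references and callRemote):

```python
# Sketch of the wrapper pattern: the application holds a long-lived
# wrapper, never the raw session-bound reference, so the wrapper can
# swap in a fresh reference after a reconnect. Names are illustrative.

class DeadReferenceError(Exception):
    """Stands in for twisted.spread.pb.DeadReferenceError."""

class FakeReference:
    """A session-bound remote reference that dies with its connection."""
    def __init__(self):
        self.alive = True
    def call(self, method, *args):
        if not self.alive:
            raise DeadReferenceError(method)
        return (method, args)

class FakeRegistry:
    """Hands out fresh references, one per lookup (i.e. per session)."""
    def lookup(self, name):
        return FakeReference()

class RemoteComponentWrapper:
    """Holds the lookup key so the component can be re-queried later."""
    def __init__(self, registry, name):
        self._name = name                  # saved for re-resolution
        self._ref = registry.lookup(name)
    def call(self, method, *args):
        return self._ref.call(method, *args)
    def on_connected(self, new_registry):
        # Receiver for the client factory's "connected" signal:
        # re-query the registry to get a new remote reference.
        self._ref = new_registry.lookup(self._name)
```

Application code only ever touches the wrapper, so a reconnect (which replaces `self._ref`) is invisible to it.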
Having the network connect/disconnect signals from the client factory also permits any other part of the application to perform certain operations during an outage (so sometimes at our top-level UI we'll put up a "temporary outage" message during downtime).

While this is fairly specific to our environment, it means we can take an application that runs locally and, with a single change (getting its registry from our client factory instead of locally), have everything work remotely, including automatic reconnects and re-establishment of all remote object references.

Hopefully detailing some of the steps will help you envision how to do something similar for your application.

-- David
Thanks for the detailed response! It really helped me
to get a sturdier perspective on everything.
My situation is slightly different than what you
described. My application consists of multiple servers
and clients responsible for the different aspects of
data processing, acquisition and storage (there are,
for example, a database, a configuration server,
arbitrary GUI clients, etc.), and each server exposes
only one referenceable object (the root) representing
the server's interface. One consequence, for example,
is that a unit may disappear and re-appear at a
different location (ex. the same user logging in at a
different machine).
I had in mind something very similar to your component
registry: a central server would act as a "unit
directory", containing a listing of units with unique
id's, contact information, and other arbitrary
details. Each unit in the system announces its
existence to the central server every once in a while,
and receives a remote cache of the listing, which gets
updated every time there is a change in the list.
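A minimal sketch of that unit-directory idea (hypothetical names and timeout; a real version would live behind a PB root object and push listing updates to the cached copies):

```python
# Units announce themselves periodically; the directory drops entries
# that have not been refreshed within a timeout window. Times are
# passed in explicitly here to keep the sketch deterministic.

class UnitDirectory:
    def __init__(self, timeout=30.0):
        self._timeout = timeout
        self._units = {}   # unit id -> (contact info, last announce time)

    def announce(self, unit_id, contact, now):
        """Record (or refresh) a unit's contact information."""
        self._units[unit_id] = (contact, now)

    def listing(self, now):
        """Return only units heard from within the timeout window."""
        return {uid: contact
                for uid, (contact, seen) in self._units.items()
                if now - seen <= self._timeout}
```

A unit that stops announcing (because it crashed, or moved to another machine) simply ages out of the listing, and re-appears under the same id with new contact information when it announces again.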
In addition to the local copy of the central unit
directory that every unit holds, it would also hold a
pool of open connections. When a unit wants to contact
another unit, it would query the local copy of the
directory for the desired unit (ex. database, gui unit
of user "bob", etc.), get an address and port number,
and return a wrapper around the remote reference of
the other server's root containing the address
information, called a "friend". If the required
address is already connected to in the local pool of
connections, it is simply directed to use that
connection. When a call is made on the "friend"
object, it attempts to open a connection (if needed,
in which case the new connection is added to the
pool), and makes the remote method call. If it is an
old connection and the call fails, it would attempt to
make a new connection once and repeat the process,
after which the failure (if it persists) is reported
to the caller.
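That call path can be sketched roughly as follows (hypothetical names, no Twisted specifics; real calls would be callRemote and would return deferreds rather than raising inline):

```python
# Sketch of the "friend" wrapper: use a pooled connection if one
# exists, otherwise open one; on failure over an existing connection,
# reconnect once and repeat the call before reporting the failure.

class RemoteCallError(Exception):
    """Stands in for a failed remote call (e.g. a lost connection)."""

class FakeConnection:
    """Stub for a PB connection; real code would hold a root reference."""
    def __init__(self, address):
        self.address = address
        self.broken = False
    def call(self, method, *args):
        if self.broken:
            raise RemoteCallError(method)
        return (method, args)

class ConnectionPool:
    """Shared address -> connection map, one live connection per peer."""
    def __init__(self, connector=FakeConnection):
        self._connector = connector
        self._conns = {}
    def get(self, address):
        return self._conns.get(address)
    def connect(self, address):
        conn = self._connector(address)
        self._conns[address] = conn
        return conn

class Friend:
    """Wrapper around a remote root, addressed via the unit directory."""
    def __init__(self, pool, address):
        self._pool = pool
        self._address = address
    def call(self, method, *args):
        conn = self._pool.get(self._address)
        fresh = conn is None
        if fresh:
            conn = self._pool.connect(self._address)
        try:
            return conn.call(method, *args)
        except RemoteCallError:
            if fresh:
                raise            # a brand-new connection failed: report it
            # stale pooled connection: reconnect once and repeat the call
            conn = self._pool.connect(self._address)
            return conn.call(method, *args)
```

The retry-once rule only applies to calls over a pre-existing pooled connection, since a failure over a connection that was just opened is most likely a real outage rather than a stale socket.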
This covers most situations nicely, but with some I am
still struggling:
The main issue is this: it often happens that one unit
in the system wants to be notified when something
happens at another unit. The problems begin when there
are network problems. For example, let's suppose the
unit that wants to be notified sends its root object
as the object that should be kept by the notifying
unit and used as a remote reference to make the
notification. The question is how to handle such
notifications during network problems; for example,
assume that a notification request is made, then the
connection is lost, then network problems subside but
the connection is or is not resumed, and then the
event to notify about happens. Whose responsibility is
it to take care of that? I'm just asking if there is
any standard or common way to approach this, or if I'm
on my own.
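For what it's worth, one common approach (sketched below with hypothetical names; this is not a built-in PB facility) is to make the subscriber responsible: it remembers what it subscribed to and replays every subscription whenever the connection comes back, so the notifying unit never has to track dead callbacks across outages:

```python
# Sketch: the unit that wants notifications keeps its own record of
# subscriptions and re-registers them on reconnect. Events that fire
# during the outage are simply missed unless the notifier queues them.

class FakeServer:
    """Stub for a notifying unit; real code would be a pb.Referenceable."""
    def __init__(self):
        self.registered = []
    def register(self, event, callback):
        self.registered.append(event)

class NotificationClient:
    def __init__(self, server):
        self._server = server
        self._subscriptions = []   # remembered so we can re-subscribe
    def subscribe(self, event, callback):
        self._subscriptions.append((event, callback))
        self._server.register(event, callback)
    def on_reconnected(self, new_server):
        # Handler for a "connected" signal: the subscriber, not the
        # notifier, re-establishes every registration on the new session.
        self._server = new_server
        for event, callback in self._subscriptions:
            new_server.register(event, callback)
```

Events that happen while the connection is down are a separate policy decision: the notifier can drop them, queue them keyed by subscriber id for delivery after re-registration, or let the subscriber re-query current state once it is re-subscribed.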
Thanks a lot,
Joe.
participants (3)

- Dave Cook
- David Bolen
- Joachim Boomberschloss