Antony Kummel
1. Synchronization of events and state/data changes: I want clients to be able to receive notifications from a server when certain events occur. When such events are related to changes in state or data of the server that are also accessible to the user, I want the client's interface for retrieving the data and getting notified of the event to be coherent.
(...)
The solution I came up with is to always expose associated events and data using a single cacheable object, so that when there is a change in state/data that should also make an event fire, the data change is propagated to the remote cache, and the remote cache fires an event locally (at the client's side). This way, the client's representation of the server's state is always coherent.
This seems reasonable to me, or at least as close as you can come to your requirement of keeping a remote client's data/events in sync. You would need to ensure that your state object managed changes to itself such that it only reflected changes down the PB channel to any cacheable observers at appropriate times, when its own state was consistent. E.g., you wouldn't necessarily update on each field or attribute change. What you're effectively doing here is using the cacheable object as your transaction manager, controlling what clients see.

Note that to ensure consistency between clients and server, you'd want to have everyone working from cacheables of this object, and nobody (even on the server itself) working from the object itself; otherwise, you'd need a separate mechanism (such as with point 2) to keep those direct users consistent. You still have the issue of network outages preventing updates from making it to clients, but in that case you'd only have to deal with a disconnected client being out of date with respect to the server and other clients, in a consistent way; upon reconnecting it would receive a new, but again consistent, set of state information.
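To make the "cacheable as transaction manager" idea concrete, here is a minimal plain-Python sketch (all names are illustrative stand-ins, not the actual Twisted PB API): observers only ever see a coherent snapshot at commit time, never intermediate per-attribute changes.

```python
# Sketch only: the observer callbacks stand in for remote cacheable
# observers that PB would notify down the channel.

class SharedState:
    def __init__(self):
        self._fields = {}
        self._pending = {}
        self._observers = []

    def add_observer(self, callback):
        self._observers.append(callback)

    def set(self, key, value):
        # Changes accumulate locally; nothing is pushed mid-transaction.
        self._pending[key] = value

    def commit(self):
        # Only now is the change "reflected down the channel": every
        # observer sees one consistent snapshot, which doubles as the event.
        self._fields.update(self._pending)
        self._pending.clear()
        snapshot = dict(self._fields)
        for callback in self._observers:
            callback(snapshot)

seen = []
state = SharedState()
state.add_observer(seen.append)
state.set("x", 1)
state.set("y", 2)
assert seen == []                      # mid-transaction: observers see nothing
state.commit()
assert seen == [{"x": 1, "y": 2}]      # one coherent update, one event
```

The real version would push the snapshot through `Cacheable`'s observer machinery rather than local callbacks, but the batching discipline is the same.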
2. Synchronization of remote commands and remote events/data changes: I want clients to be able to issue commands to a server that make its state or associated data change, and also to have an up-to-date representation of the state or data (and possible events issued by changes in the state/data). The problem is how to synchronize the firing of the deferred returned by the remote method call with the change in the client's representation of the server's state/data (i.e. to make sure that when the deferred fires, the client's representation of the state is coherent with the command, i.e. changed).
(...)
Presently I have no solution for this, but I'm pretty sure that it requires some combination of referenceable and cacheable that will ensure one coherent interface to associated data, events and commands. I am thinking of achieving this with a copyable that will contain both a referenceable and a cacheable.
I think you're going to have an uphill battle here trying to synchronize what is inherently distributed and asynchronous behavior. If I absolutely had to do this, it would probably be along the lines of what you're considering - I'd consider having a transaction object that encapsulated change requests and state updates under a single umbrella. Rather than a simple cacheable, though, you'd probably need to implement a two-phase commit protocol so that the state changed on both client and server, or not at all. If you didn't do that, you'd leave yourself open to the operation occurring on the server but the network preventing the new state information from getting down to the client, leaving the client not knowing what state the server was in.

But I'd much rather just assume that the client needed to work properly with the state as it currently knew it, and respond properly to any state updates as they occurred, whether due to its own operation or some other client's operation. E.g., formally decouple the request to perform an operation from the state changes that occur as that operation is performed. In other words, try to stick more to a model/controller approach, where the model state is monitored by clients and they simply react to its changes, while actions always flow through a distinct controller path that is associated with, but decoupled from, the model.
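A minimal plain-Python sketch of that model/controller split (names are illustrative; in Twisted terms the model would be a Cacheable and the controller a Referenceable): clients never mutate the model directly, they send requests through the controller and simply react to change notifications, whether triggered by their own request or by another client's.

```python
# Sketch only: subscriptions stand in for cacheable observers; the
# controller is the single path through which changes may occur.

class OnlineUsersModel:
    def __init__(self):
        self.online = set()
        self._observers = []

    def subscribe(self, callback):
        self._observers.append(callback)

    def _changed(self):
        for callback in self._observers:
            callback(frozenset(self.online))

class PresenceController:
    def __init__(self, model):
        self._model = model

    def go_online(self, user):
        self._model.online.add(user)
        self._model._changed()

    def go_offline(self, user):
        self._model.online.discard(user)
        self._model._changed()

events = []
model = OnlineUsersModel()
model.subscribe(events.append)
ctl = PresenceController(model)
ctl.go_online("alice")
ctl.go_online("bob")
ctl.go_offline("alice")
assert events[-1] == frozenset({"bob"})
```

Note that the client's request ("go online") and the state change it observes (the updated online set) arrive by two different paths; the client doesn't try to correlate its own deferred with the model update, it just reacts to the model.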
3. Wrapping remote objects for non-online representation. I want to be able to have objects that can be used locally, but may also provide an interface to one or more servers that have to do with the state of the object. Another reason for these wrappers is that I want to be able to pass them freely from process to process, without being dependent on the connection or on a specific server.
(...)
The question is how to represent this to the user of the object: whether to allow access to cached remote data and subscription to events when not online, whether to return a deferred and try to connect in case we don't have the data or to throw an exception, etc. I suppose this is mostly a matter of style, but if anyone has done something like this before, maybe they would have some insight.
What we've done in a system of ours that is designed to be a distributed data system is to treat the core data objects in the system as pure state objects as much as possible, generally falling into two classes:

* Pure instance data (data objects), of which multiple copies may exist simultaneously throughout the system, but which operate locally (Copyable in Twisted). The only methods such objects have are to manipulate the local representation (also handled by direct attribute access).

* Shared state objects (model objects), of which multiple observers throughout the system may exist (Cacheable in Twisted). While these objects may have similar method and attribute access as the instance data above, they are only ever manipulated by controller objects, whose original instance lives on the same node as the original instance of the state object. Any client needing to make changes must use a reference to that controller (a Twisted Referenceable) and not the model itself, even on the local node where the model/controller are instantiated.

The choice between the two object types is not a hard and fast rule - we've tended to use more of the former so far.

We then constructed a framework of "manager" objects which are designed to provide access to and manipulation of the above data objects. The key attribute of a manager object is that it is both referenceable (a Twisted Referenceable) and that all of its methods are deferrable interfaces - even if used locally. Although I had a desire to try to somehow just always pass object references (referenceables) around to everything and let PB handle everything transparently, in practical terms I didn't find that workable. There are just too many nooks and crannies you can get into as things get distributed, so taking more explicit control became necessary to ensure robustness.
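The "deferrable even if used locally" property can be sketched in plain Python (the real code would use `twisted.internet.defer`; the `Deferred` stand-in and all names here are illustrative): every manager method returns a deferred-like value, so caller code is written identically whether the manager is local or a remote wrapper.

```python
# Sketch only: a minimal Deferred stand-in that fires synchronously,
# just to show the uniform calling convention.

class FakeDeferred:
    def __init__(self, value):
        self.value = value

    def addCallback(self, fn):
        self.value = fn(self.value)
        return self

def deferrable(fn):
    """Wrap a plain method so it always returns a deferred-like result."""
    def wrapper(*args, **kwargs):
        return FakeDeferred(fn(*args, **kwargs))
    return wrapper

class UserManager:
    @deferrable
    def get_user(self, name):
        return {"name": name}

results = []
UserManager().get_user("alice").addCallback(results.append)
assert results == [{"name": "alice"}]
```

With real Twisted, `defer.maybeDeferred` (or returning `Deferred`s outright) gives the same effect, and the caller's `addCallback` chain is unchanged when the call becomes a `callRemote`.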
And you just can't always assume that making changes to state on what is ostensibly a shared object will magically get reflected everywhere reliably. But that's actually where we found PB to be at just the right level, as it was easy enough to wrap to behave how we wanted.

We provided an extra layer of wrapping for the networking, both for the basic connection and for our referenceables. A Client object encapsulates making a connection to a server, reconnecting as necessary, and generating local signals (we use pydispatcher) on connection state changes. A matching Server object provides client access to local managers - through a simple Registry object - upon a client's connection. A general purpose wrapper object is used to wrap each manager referenceable so that it appears to be local (it uses the manager's interface definition to automatically translate method calls into callRemote), as well as to automatically re-establish contact with the remote manager if needed, by listening for the Client signals and, on a reconnect, re-obtaining the remote Registry handle and refetching remote references to the manager it had previously wrapped.

The wrapping may be multi-layer. This allows us, for example, to have a remote site with a master server which maintains the client link to a central server. That site server therefore has what it considers a local Registry with a whole set of local managers - all of which are technically wrappers around the Twisted referenceable to the main central server. But then other machines at the site themselves become clients of the site server, with their own references to the site server's references. So when the other site machines make requests, they flow to the site server, and then up to the central server and back down.
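The wrapper's call-translation and rebind-on-reconnect behavior can be sketched in a few lines of plain Python (names are illustrative; the real version forwards to a PB remote reference's `callRemote` and returns Deferreds): attribute access on the wrapper becomes a call on whatever reference it currently holds, so the reference can be swapped after a reconnect without callers noticing.

```python
# Sketch only: RemoteRef stands in for a PB remote reference.

class RemoteRef:
    def __init__(self, label):
        self.label = label

    def callRemote(self, name, *args):
        # Real code would return a Deferred; a tuple suffices for the sketch.
        return (self.label, name, args)

class ManagerWrapper:
    def __init__(self, ref):
        self._ref = ref

    def rebind(self, ref):
        # Called after a reconnect, once the remote manager is re-fetched
        # from the Registry.
        self._ref = ref

    def __getattr__(self, name):
        # Translate any method call into a callRemote on the current ref.
        def call(*args):
            return self._ref.callRemote(name, *args)
        return call

mgr = ManagerWrapper(RemoteRef("server-1"))
assert mgr.get_user("alice") == ("server-1", "get_user", ("alice",))
mgr.rebind(RemoteRef("server-2"))   # outage, reconnect, re-fetch
assert mgr.get_user("alice") == ("server-2", "get_user", ("alice",))
```

The system described above drives method generation from the manager's interface definition rather than a catch-all `__getattr__`, but the indirection through a swappable reference is the essential trick.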
But the same source code works whether running on the central server or at any level of the site servers, without knowing the difference (since all registry and manager interfaces are deferrable anyway).

While the wrapper isolates users from needing to worry about reconnecting, we don't attempt to hide the fact that an outage is occurring. Attempts to make a call on a wrapped manager during an outage generate normal PB exceptions, with one change: we modified Twisted to always return exceptions up the deferred chain (even for a dead reference) so clients wouldn't have to deal with both local exceptions and errbacks. In practice, the client applications generally have some application-level object that is also listening to the Client object's connection signals, and either blocking access to the user when the network is down (with an appropriate message) or adjusting behavior accordingly.

So, in operation, client code works something like:

* Instantiate a Client object, give it connection info, and start the connection. Request the registry from the client object (which is deferrable and only fires once the overall connection cycle is complete).

* Using the Registry object (which itself is a remote wrapper version on the client side), query references for any manager objects needed.

* Using the manager objects, retrieve any data objects needed. Changes to model objects occur through their controllers, while changes to data objects are performed locally and updated via explicit "save" calls to the managers.

The last point is where we run into similar issues to yours, I think. By choosing this route, we do not provide for other clients of that same data object to automatically see changes made by other clients. They would continue to run with the copy they had previously received, although any subsequent retrieval would get a new copy with the new data.
To handle conflicting state changes, the originator of the data object (the actual manager object, on whatever node it exists) maintains an internal tag in the object (we use a UUID, but it could also be a hash of the contents) representing its unique state, and will raise a SaveConflict exception of our own if someone else attempts to store changes to an outdated copy. It is up to clients to handle such issues, should they occur (typically by re-querying the information and then re-applying their changes), although in practice we don't really have scenarios where this happens yet, due to typical usage patterns.

Some of this could change if we moved a data object to a model object, but then we're requiring that even simple users of the data object maintain a remote cacheable reference to the object, which is relatively heavyweight. Thus my comment above about it being a grey area as to which sort of object we decide to place such state in.

In our environment, our User object (which contains identifying and control information about users) is just a data object, as the need for simultaneous manipulation and monitoring of it is reasonably low. We do expect to have many copies of it around, but mostly on a read-only basis. In your context, I would think that the user object itself need not be something that constantly updates, but the state of which users are currently online would fit better as a model (and the controller feeding it would have methods for a given user to go online or offline). In our structure, we would separate out the concept of generating a system message - probably into a messaging manager - which would then receive requests to transmit messages to identified users. But I don't think I'd try to tie those three things (current User object contents, currently-online user set, generating a message) into any sort of guaranteed state ... I'd leave them very loosely coupled.
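The tag-based save-conflict scheme is standard optimistic concurrency; a minimal plain-Python sketch (names like `Manager.load`/`save` are illustrative, not our actual API):

```python
import uuid

class SaveConflict(Exception):
    """Raised when a save is attempted against an outdated copy."""

class Manager:
    def __init__(self):
        self._data = {}   # key -> (tag, value)

    def load(self, key):
        return self._data[key]   # client keeps the tag alongside the value

    def save(self, key, tag, value):
        current_tag, _ = self._data.get(key, (None, None))
        if current_tag != tag:
            # Someone else saved since this copy was loaded.
            raise SaveConflict(key)
        # Accept the change and mint a fresh tag for the new state.
        self._data[key] = (uuid.uuid4().hex, value)

mgr = Manager()
mgr._data["user:1"] = ("t0", {"name": "alice"})
tag, value = mgr.load("user:1")
mgr.save("user:1", tag, {"name": "alice", "admin": True})   # succeeds
conflicted = False
try:
    mgr.save("user:1", tag, {"name": "mallory"})            # stale tag
except SaveConflict:
    conflicted = True
assert conflicted
```

On conflict the client re-queries (getting the new tag and value) and re-applies its changes, exactly as described above.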
On the issue of distributed events, that's the area we're currently working on, and to us the hardest part is how to handle events that may be generated during outages for which the disconnected clients have subscriptions. If it's just changes to state objects (such as cacheables), that's not so bad, since the reconnection process will re-query the current state information. But if it's for more general notifications (we might have our own event for "user updated", like your "user came online"), you have the question of how long to queue up such events for clients that might never show up again.

Currently we are targeting such events being handled by a signal or event manager, which will maintain an ongoing history of such events. Subscribers to the event manager will get copies of appropriate events. When a client connects, its local wrapper for the remote event manager will actually handle all local subscriptions, maintaining a single remote set of subscriptions to minimize network I/O. It will also track the delivery of any events. Upon being disconnected and reconnecting (per the standard mechanisms), the client event manager wrapper will request any signals that may have been generated since the last event seen prior to the disconnect. We'll have to bound this somehow for prolonged outages. But a key point is still decoupling the event handling from other operations; we won't be trying to force everything to stay in sync with other clients and/or servers at all times.

If you've put up with me until here, I hope that this at least gives you some other approaches to think about, even if some or all of it isn't directly applicable to your problem domain.

-- David