State and web2 (or, how to not follow REST)
This is an attempt to summarize a conversation I had with glyph on #twisted.web earlier today. I've attached the IRC log. The basic problem discussed was how to manage server-side state, which in particular includes sessions and authentication. Stateful servers cause serious problems with scalability and with bug hunting.

Glyph is adamant that getSession() as it currently works (by raising a redirect exception if the session does not already exist) must be fixed. I absolutely agree with this; as it currently stands, the top-level resource must always ask for a Session object to avoid unexpected redirects further down the request stream. This shouldn't be a lesson of experience; it should be built-in convention.

A related issue, one that Glyph is very concerned about, is the implicit coupling of resources during the handling of a given request. In particular, a resource X (located at /foo) sets a variable V in the request R, and then a resource Y (at /foo/bar) comes to depend upon this variable V. If this coupling is not made explicit and checked for, an opportunity for rather obscure bugs emerges: the resource Y is re-used in another context (say at /bing/bar) and still assumes that V is set. While the Twisted framework cannot prevent such nonsense, it should propose an alternative mechanism, or at the very least not promote such dynamic resource dependencies.

One way to make resource dependencies explicit is to require that the constructor for a child resource take an optional ancestor resource. In this model, each user/session would in essence have its own top-level resource, and every resource which depended upon session state would take the parent resource in its constructor.
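As a rough illustration of that constructor-based model, here is a minimal sketch in plain Python. It is not Twisted code: the Resource stand-in, the FooResource and BarResource names, and the `v` attribute are all hypothetical stand-ins for X, Y and V.

```python
class Resource:
    """Minimal stand-in for a web resource (not the Twisted class)."""

    def render(self, request):
        raise NotImplementedError


class FooResource(Resource):
    """Hypothetical resource at /foo that owns the value V."""

    def __init__(self, v):
        self.v = v

    def render(self, request):
        return "foo: %s" % self.v


class BarResource(Resource):
    """Hypothetical resource at /foo/bar that depends on V.

    The dependency is visible in the constructor instead of being
    passed along implicitly on the request object.
    """

    def __init__(self, parent=None):
        self.parent = parent

    def render(self, request):
        if self.parent is None:
            # Re-used without a FooResource ancestor, e.g. at /bing/bar.
            raise RuntimeError("BarResource needs a parent that provides V")
        return "bar sees V=%s" % self.parent.v


foo = FooResource(v="alice")
print(BarResource(parent=foo).render(request=None))
```

A BarResource built without a parent fails loudly at render time rather than silently reading a variable that was never set.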
This approach has a few deficiencies: (a) there might be more than one instance of a resource Y at /foo/bar, one for each user; this is not only inefficient but also makes debugging hard, because the mapping from a URI to a resource object is no longer a function; (b) while leaf resources, such as a static.File object, need not take a parent resource in their constructor, generic Resources are forced to carry a "pass-through" parent Resource even when they do not need state information. In the IRC conversation, I believe (and hope) this was proposed and then eventually rejected, but I'm not sure. I don't like the idea of any Resource objects in the system being user or session specific.

Another alternative is to add getSiteResource() and setSiteResource() to the Request interface. The SiteResource would then contain the top-level resource "/", which reflects the user's Session and any other application-specific server-side state (i.e., nasty persistent global-like variables which break REST). The SiteResource would therefore be an application-specific object; it could, for example, contain a session-id and a username property for downstream Resource authorization. Later in the IRC chat, Glyph said he is "coming around to the fact that it's not really a resource". This is good, because I don't think that this server-side state is a resource by the definition of web architecture.

I didn't mention it in the IRC chat, but I'm now thinking that these methods on the request object could be setState() and getState(), and that they return an arbitrary application-defined object which has all of the nasty (but unfortunately mandatory and practical) session and request "global variables" that break REST and can cause all sorts of problems. Glyph mentioned earlier in an email that perhaps a declarative syntax could be introduced so that arbitrary Resources could advertise exactly what "state" they will access; these sorts of errors could then be detected and reported more intelligently.
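Such a declaration might look roughly like the following. This is only a sketch under stated assumptions: the requiresState attribute, the checkState() and dispatch() helpers, the MissingState exception, and the use of a plain dict as the state object are all hypothetical, and none of it is an existing Twisted API.

```python
class MissingState(Exception):
    """Raised when a resource is reached without the state it declared."""


class Request:
    """Stand-in request carrying one application-defined state object."""

    def __init__(self):
        self._state = None

    def setState(self, state):
        self._state = state

    def getState(self):
        return self._state


class ReportResource:
    """Hypothetical resource that advertises the state it will read."""

    requiresState = ("username", "session_id")

    def render(self, request):
        return "report for %s" % request.getState()["username"]


def checkState(resource, request):
    """Fail early, with a readable message, if declared state is missing."""
    state = request.getState() or {}
    missing = [name for name in getattr(resource, "requiresState", ())
               if name not in state]
    if missing:
        raise MissingState("%s needs %s" % (
            type(resource).__name__, ", ".join(missing)))


def dispatch(resource, request):
    """Check the declaration first, then hand the request to the resource."""
    checkState(resource, request)
    return resource.render(request)


request = Request()
request.setState({"username": "clark", "session_id": "abc123"})
print(dispatch(ReportResource(), request))
```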
I like this idea; it is framework support that prompts developers to put in the assertion checks they should already be doing. It codifies a solid practice, and this is a good thing. ;)

I think that Glyph and I did have a clear agreement: all of the information on the State (in Glyph's terms, SiteResource) object should be set (and perhaps made read-only?) _before_ any Resource delegation is made. We do have a slight (but very slight) semantic difference. I see the process of setting up a session and creating any server-side state as done in an IRequestHandler _before_ any IResources are called. Glyph sees this as being done "by a resource which sits at the top level". In both models, this sort of stuff is done before your average everyday resources are processed; my model is just more explicit and allows for chaining IRequestHandlers (such as one for sessions, and another one for authentication) before IResources are processed.

I think that's about it. I just want a simple solution to this, and soon.

Best,

Clark
On IRC today, foom pointed out that the things which modify a request (an IRequestHandler), such as setting up a session or authenticating a user, must also be able to happen between IResources, not only before them. The example given was /~user/bing/my-app/foo, where authentication (and possibly setting up a session) happens at the marked point:

    /~user/bing/{HERE}my-app/foo

So, the goal is to distinguish between Resources, which should not be modifying the request's context, and RequestFilters, which by definition change the request. The glue can be done by having a ContextResource that explicitly applies one or more RequestFilters to the current request. We can then designate a syntax for asserting that a particular IRequestFilter has been applied to the request.

Anyway, just consider this a random attempt to detail the idea....

    # Assumed imports; the original sketch did not name them.
    from zope.interface import Attribute, Interface
    from twisted.web2.resource import Resource

    class IRequestFilter(Interface):
        """
        I am a filter that is applied to a given request. Examples of a
        RequestFilter are a "SessionManager", an "Authenticator", etc.
        """
        def apply(self, request):
            """
            Perform necessary logic on the request, modifying the
            request's context as necessary and/or raising HTTPError as
            needed. Ideally only an IRequestFilter modifies a request's
            context.
            """

    class IResource(Interface):
        # same as before, adding:
        require = Attribute(
            "A sequence of IRequestFilters which must have been applied "
            "to a Request before this resource is accessed")

    class IRequest(Interface):
        # same as before, adding:
        filtered = Attribute(
            "A sequence of IRequestFilters that have been applied.")
        # since there are two objects that are so commonly used, I feel
        # that they merit their own "slot" on the Request object:
        session = Attribute(
            "An optional ISession object added by the "
            "ISessionRequestFilter.")
        avatar = Attribute(
            "An optional IAvatar object added by the "
            "IAuthenticateRequestFilter, or other providers of an IAvatar")

    class ContextResource(Resource):  # or SiteResource, or AppResource
        """
        I am an application or site resource which modifies the incoming
        request via a set of RequestFilters. Only ContextResources
        should be doing this sort of thing.
        """
        filter = None

        def registerFilter(self, requestFilter):
            # Create the list lazily so instances never share one list.
            if self.filter is None:
                self.filter = []
            self.filter.append(requestFilter)

        def locateChild(self, request, segments):
            # Apply each filter in registration order and record it on
            # the request, then fall through to the normal child lookup.
            for requestFilter in self.filter or ():
                requestFilter.apply(request)
                request.filtered.append(requestFilter)
            return Resource.locateChild(self, request, segments)

    # SessionManager and HTTPAuthHandler stand for IRequestFilter
    # providers that are not defined in this sketch.
    myapp = ContextResource()
    myapp.registerFilter(SessionManager())
    myapp.registerFilter(HTTPAuthHandler())
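The sketch above leaves open what the syntax for asserting that a filter has been applied would look like. One possible shape, in plain Python without Twisted, is shown below. Everything in it is an assumption made for illustration: the assertFiltered() and applyFilter() helpers, the FilterNotApplied exception, the AccountPage resource, and the choice to list filter classes (rather than instances) in `require`.

```python
class FilterNotApplied(Exception):
    """Raised when a resource is reached before a filter it requires."""


class Request:
    """Stand-in request; only the attributes the check needs."""

    def __init__(self):
        self.filtered = []  # filters applied so far, in order
        self.session = None


class SessionManager:
    """Hypothetical filter that attaches a session to the request."""

    def apply(self, request):
        request.session = {"id": "example-session"}


class AccountPage:
    """Hypothetical resource that refuses to run without a session."""

    require = (SessionManager,)  # assumption: filter classes, not instances

    def render(self, request):
        return "session %s" % request.session["id"]


def assertFiltered(resource, request):
    """Check every entry in `require` against `request.filtered`."""
    for filterClass in getattr(resource, "require", ()):
        if not any(isinstance(f, filterClass) for f in request.filtered):
            raise FilterNotApplied("%s requires %s" % (
                type(resource).__name__, filterClass.__name__))


def applyFilter(requestFilter, request):
    """Apply one filter and record it, as a ContextResource would."""
    requestFilter.apply(request)
    request.filtered.append(requestFilter)


request = Request()
applyFilter(SessionManager(), request)
page = AccountPage()
assertFiltered(page, request)
print(page.render(request))
```

With a check like this, re-using AccountPage under a path where no SessionManager was applied raises FilterNotApplied before the resource ever touches request.session.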
Clark C. Evans