[Twisted-Python] Serving files, again
Hi,

I hacked together the rpy below to serve files (those specified by the leftover part of the URL, rooted at a hard-wired root directory on the system), following the suggestions of this list, picking from the example bits, and salvaging whatever ideas I can from the docs. However, the mental model I'd like to form is still very murky...

After Clark's post from yesterday, I tried to redo this in the "proper" way, i.e. using putChild. However, the best I could get was "Request did not return a string" when trying putChild and getChild as the content of the render() method below (with isLeaf=0).

Would anyone be able to clarify what the render() method should be such that the URL http://host/docs/some/file.txt will return the file at the system location /Path/To/Somewhere/some/file.txt ?

Oh, and in general, how does one turn off directory browsing for a twisted.web server?

Thanks, mario

    ############################
    # docs.rpy

    from twisted.protocols import http
    from twisted.web import resource, error
    import os

    ###
    docsBase = '/Path/To/Somewhere'
    serveFileTypes = ['','.txt','.pdf','.gif']

    ###
    class RestrictedResource(resource.Resource):

        def isLeaf(self):
            return 1

        def render(self, request):
            # a few vars used in the blocks below
            subPath = '/'.join(request.postpath)
            fullPath = docsBase +'/'+ subPath
            rootPath = '/docs/'+subPath
            if not len(subPath):
                # otherwise we get a double slash (when postpath is zero length)
                rootPath = '/docs'+subPath
            dirlist = []

            # build list or return file
            try:
                if not os.path.exists(fullPath):
                    raise Exception # of type...
                elif os.path.isdir(fullPath):
                    dirlist = processDirlist(os.listdir(fullPath))
                elif os.path.isfile(fullPath):
                    import mimetypes
                    mimetype = mimetypes.guess_type(fullPath)[0]
                    if not mimetype:
                        mimetype = 'text/plain' # fallback
                    request.setHeader("content-type", mimetype)
                    # open outside the try so f is always bound in finally;
                    # binary mode so .pdf and .gif are not mangled
                    f = open(fullPath, 'rb')
                    try:
                        return f.read()
                    finally:
                        f.close()
                else:
                    raise Exception # of type..
            except:
                errpage = error.ErrorPage(http.NOT_FOUND,"Not Found",rootPath)
                return errpage.render(request)

            # response string
            s = '<ul>'
            for file in dirlist:
                s += '<li><a href="%s">%s</a></li>' % ( rootPath+'/'+file, file )
            s += '</ul>'
            title = 'Directory listing for ' + rootPath
            return '''<html><head><title>%s</title></head><body><h1>%s</h1>%s</body></html>''' % (title,title,s)

    ###
    resource = RestrictedResource()

    ###
    def processDirlist(dirlist):
        dl = []
        for filename in dirlist:
            (name, suffix) = os.path.splitext(filename)
            if suffix in serveFileTypes:
                dl.append(filename)
        return dl
    ###
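One detail worth pinning down in the script above is the mapping from the leftover URL segments (request.postpath) to a path under docsBase. A minimal standard-library sketch follows. resolve_under_root is a hypothetical helper, not part of Twisted, and it additionally rejects paths that escape the root via '..' segments, which the script above does not attempt.

```python
import os


def resolve_under_root(root, postpath):
    """Map leftover URL segments onto a path below root.

    Hypothetical helper (not a Twisted API). Returns None when the
    joined path would fall outside root, e.g. via '..' segments.
    """
    root = os.path.realpath(root)
    candidate = os.path.realpath(os.path.join(root, *postpath))
    # commonpath equals root only when candidate is root or lies below it
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate


# /docs/some/file.txt -> postpath ['some', 'file.txt']
print(resolve_under_root('/Path/To/Somewhere', ['some', 'file.txt']))
# An attempt to climb out of the root is refused
print(resolve_under_root('/Path/To/Somewhere', ['..', 'etc', 'passwd']))
```

A render() method could call such a helper first and return the "Not Found" page whenever it yields None.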
On Wed, Feb 26, 2003 at 12:48:31AM +0100, Mario Ruggier wrote:
| Would anyone be able to clarify what the render() method should be
| such that the url http://host/docs/some/file.txt will return the file
| at system location of: /Path/To/Somewhere/some/file.txt ?

I don't know if this is your question, but I'd return the contents of file.txt directly when the file size is less than a pre-defined size, say 32K. Beyond that, the file should be DEFERRED into a thread which sends the file a chunk at a time. Further, it'd be cool if I could specify in the request object whether the response should be compressed via gzip.

This brings up another point about Resource: it'd be nice if there was a sub-class of Resource called a "FilterResource" which basically didn't serve content but which perhaps consumed path segments or arguments or cookies and perhaps altered the request object. A FilterResource would have an additional method called "initRequest" which could add stuff like my pathargs variable, and in the above request could add a flag to the request object specifying whether the file content should be compressed.

Clark
On Wed, Feb 26, 2003 at 07:20:02PM +0000, Clark C. Evans wrote:
| On Wed, Feb 26, 2003 at 12:48:31AM +0100, Mario Ruggier wrote:
| | Would anyone be able to clarify what the render() method should be
| | such that the url http://host/docs/some/file.txt will return the file
| | at system location of: /Path/To/Somewhere/some/file.txt ?
|
| I don't know if this is your question; but I'd return the contents
| of file.txt when the file size is less than a pre-defined size, say 32K.
| Beyond that, the file should be DEFERRED into a thread which sends
| the file a chunk at a time. Further, it'd be cool if I could
| specify in the request object if the response should be compressed
| via gzip.

Actually, shouldn't it always be deferred? The file system may be writing at the time (via another thread), so doing a file read (even for a tiny file) at this point could cause significant lag. Or am I completely misunderstanding this whole async stuff?

Clark
On Wed, Feb 26, 2003 at 07:51:35PM +0000, Clark C. Evans wrote:
We usually consider IO on local fixed disks to be fast enough. In any case, select() in POSIX tells you that files are always ready for reading, so being smarter about it requires using a different mechanism (which is entirely possible, but requires a different reactor, not to mention platform support).

BTW, deferring to a thread would not be the way to go. Something similar to twisted.spread.util.Pager would probably be appropriate, or maybe something that implements IProducer. Or maybe just a chain of Deferreds :) No need to go into threads for this, though.

Jp

--
"I quite agree with you," said the Duchess; "and the moral of that is -- 'Be what you would seem to be' -- or, if you'd like it put more simply -- 'Never imagine yourself not to be otherwise than what it might appear to others that what you were or might have been was not otherwise than what you had been would have appeared to them to be otherwise.'" -- Lewis Carroll, "Alice in Wonderland"
--
up 18 days, 0:29, 5 users, load average: 0.47, 0.28, 0.14
Thanks Jp, this is helpful.

On Wed, Feb 26, 2003 at 02:50:31PM -0500, Jp Calderone wrote:
| We usually consider IO on local fixed disks to be fast enough. In any
| case, select() in POSIX tells you that files are always ready for reading,
| so being smarter about it requires using a different mechanism (which is
| entirely possible, but requires a different reactor, not to mention platform
| support).

The files that I need to serve up are quite big (some are a meg or more), and it would be bad to block other resources while the file loads into memory via file.read(), or for the time it takes for the client to completely consume the file.

| BTW, deferring to a thread would not be the way to go. Something similar
| to twisted.spread.util.Pager would probably be appropriate, or maybe
| something that implements IProducer. Or maybe just a chain of Deferreds :)
| No need to go into threads for this, though.

Ok. So this would be the equivalent of a "file generator" which returns its content in, say, 4K chunks? This would work by returning a callback which (a) wrote out 4K and then (b) deferred itself again?

    class deferredreader:
        def __init__(self, filename, chunksize = 4096):
            self.filename = filename
            self.file = None
            self.chunksize = chunksize
        def callback(self, req):
            # DEFERRED is a placeholder meaning "call me again later"
            if not self.file:
                self.file = open(self.filename, "rb")
                return DEFERRED
            chunk = self.file.read(self.chunksize)
            if chunk:
                req.write(chunk)
                return DEFERRED
            else:
                self.file.close()
                return ""

(written but not tested)

Is this the gist of it? It still has the problem that file.read is a blocking call; I suppose for unix platforms you could use "poll()" to not block. This is probably reasonable: on the server side you don't block, while for desktop Windows clients it blocks. Is this what you were thinking of with the chain of Deferreds?

Best, Clark
On Wed, Feb 26, 2003 at 09:06:57PM +0000, Clark C. Evans wrote:
If file loading is too slow, buy some more memory. Keeping hundreds of megs of files in RAM is standard procedure for any sane operating system these days. Let it worry about keeping the file access fast. It should (and AIUI will) be served to the client chunk-by-chunk, processing other tasks between the reads.
(sorry for responding to doublequoted text) Oh, python threads may not be low-level enough to actually help with disk IO (on Linux, at least). I don't know whether they are or not. Avoiding blocking on disk IO needs a separate process context in the kernel; userspace threading will not help.
poll() or select() won't work in file access, files block always unless you use AIO or something like that. Sorry. If you are that worried about performance, type "c10k" into google and start writing C. Nothing else will really help; file access only becomes a bottleneck _after_ you've done all the other things suggested at c10k. -- :(){ :|:&};:
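The point about select() can be checked directly on a POSIX system. The sketch below uses only the standard library (it will not work on Windows, where select() accepts only sockets). It asks select() about an ordinary file and is told the file is readable straight away, which says nothing about whether a subsequent read() will have to wait for the disk.

```python
import os
import select
import tempfile

# Create an ordinary on-disk file to ask select() about.
fd, path = tempfile.mkstemp()
os.write(fd, b"x" * 4096)
os.close(fd)

with open(path, "rb") as f:
    # Zero timeout: return immediately with whatever is "ready".
    readable, _, _ = select.select([f], [], [], 0)
    file_reported_ready = f in readable

os.remove(path)
print("select() reports the file readable:", file_reported_ready)
```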
On Wed, Feb 26, 2003 at 02:50:31PM -0500, Jp Calderone wrote:
static.File already reads/writes in chunks, and it uses Producers. -- Twisted | Christopher Armstrong: International Man of Twistery Radix | Release Manager, Twisted Project ---------+ http://twistedmatrix.com/users/radix.twistd/
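The chunk-at-a-time pattern mentioned here can be sketched without Twisted. The class below only mirrors the shape of a pull-style producer (a resumeProducing method that writes one chunk each time the consumer asks for more). ChunkProducer and ListConsumer are hypothetical stand-ins, not Twisted classes, and static.File's actual implementation may differ.

```python
import io


class ChunkProducer:
    """Hypothetical pull-style producer: one chunk per resumeProducing()."""

    def __init__(self, fileobj, consumer, chunksize=4096):
        self.fileobj = fileobj
        self.consumer = consumer
        self.chunksize = chunksize
        self.done = False

    def resumeProducing(self):
        # Called by the consumer whenever it is ready for more data.
        chunk = self.fileobj.read(self.chunksize)
        if chunk:
            self.consumer.write(chunk)
        else:
            self.stopProducing()

    def stopProducing(self):
        # Called at end of file, or if the client goes away early.
        self.fileobj.close()
        self.done = True


class ListConsumer:
    """Stand-in for a request/transport: just records what it is given."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)


consumer = ListConsumer()
producer = ChunkProducer(io.BytesIO(b"a" * 10000), consumer)
while not producer.done:
    # A real event loop would service other connections between these calls.
    producer.resumeProducing()
```

Each read is still a blocking call, but it is bounded by the chunk size, so other work can be interleaved between chunks.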
On Wed, Feb 26, 2003 at 07:20:02PM +0000, Clark C. Evans wrote:
Probably what should happen is that Twisted Web should look for the Content-Transfer-Encoding header (or whatever it's called), and automatically gzip if appropriate. The response should probably be able to indicate that gzipping would be a waste of time, though -- there's not much point to gzipping jpegs, or .gz files...

-Andrew.
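For reference, the HTTP request header a client uses to advertise gzip support is Accept-Encoding, and a compressed response is labelled with Content-Encoding: gzip. Below is a minimal sketch of the decision described above, using only the standard library. should_gzip, encode_body and the SKIP_TYPES set are hypothetical names, not Twisted APIs, and the header parsing is deliberately simplistic (it ignores q-values).

```python
import gzip

# Hypothetical list of content types that are already compressed.
SKIP_TYPES = {"image/jpeg", "image/gif", "image/png", "application/gzip"}


def should_gzip(accept_encoding, content_type):
    """Return True if the client accepts gzip and the body is worth compressing."""
    # Simplistic parse: split on commas, drop any ';q=...' parameters.
    offered = [tok.split(";")[0].strip().lower()
               for tok in (accept_encoding or "").split(",")]
    return "gzip" in offered and content_type not in SKIP_TYPES


def encode_body(body, accept_encoding, content_type):
    """Return (body, extra_headers), compressing only when appropriate."""
    if should_gzip(accept_encoding, content_type):
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}
```

A resource that knows its output is already compressed could simply report one of the skipped content types, or bypass encode_body altogether.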
On Wed, Feb 26, 2003 at 12:48:31AM +0100, Mario Ruggier wrote:
|
| Oh, and in general, how does one turn off directory browsing
| for a twisted.web server?

You could do this in the constructor,

    path = RestrictedResource('/my/path', directoryBrowsing=False)

But this really should be done with a per-request mechanism. Perhaps request needs an "options" collection where a top-level Resource can set processing options for lower-level resources in the path chain (see the FilterResource post previously).

Anyway, I wanted to respond to your code below...

| class RestrictedResource(resource.Resource):
|     def isLeaf(self):
|         return 1
|     def render(self, request):
...
|         fullPath = docsBase +'/'+ subPath
|         try:
|             if not os.path.exists(fullPath):
|                 raise Exception # of type...
|             elif os.path.isdir(fullPath):
|                 dirlist = processDirlist(os.listdir(fullPath))
|             elif os.path.isfile(fullPath):
|                 import mimetypes

This is interesting. I would probably have done it in a less efficient (but perhaps more flexible) way. I would have used two resources, a DirectoryResource and a FileResource.

The DirectoryResource would override getChild(path, request) and dynamically look for a child in the current path, leveraging the descent operation in getChildForRequest. This object would then return either the subordinate NotFoundResource, a DirectoryResource or a FileResource object, depending on what the path matched. The constructor for these child resources would take a fullpath, constructed by concatenating the fullpath of the current Directory with the given path.

The FileResource would serve up the given file, by overriding the render(request) method as you have specified above. In this way one could provide a replacement FileResource or override a DirectoryResource, etc.

If you wanted to get tricky, you could "stuff" the path state into the request object and the DirectoryResource could 'return self' instead of creating intermediary Directories. This leads to the following more general points:

1. There should be a general way to attach "resource specific" data to a given request; for DirectoryResource it'd be the current path, for PathArgs it'd be the mapping of path arguments to variables.

2. This mechanism could thus be used for inter-resource communication, say where a UserSession(FilterResource) would attach a directive to "compress_files" or not to subordinate FileResources.

As I said in a previous post, I'm quite impressed with the whole "Resource" concept and the "tail recursive" descent mechanism provided via getChildForRequest.

Hope this helps...

Clark
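The two-resource design described above can be sketched with plain Python stand-ins. None of the classes below are Twisted classes. They only follow the naming used in this message, descend is a hypothetical simplification of the traversal that getChildForRequest performs, and no filtering of '..' segments is attempted.

```python
import os


class NotFoundResource:
    """Stand-in for a 404 resource."""

    def __init__(self, fullpath):
        self.fullpath = fullpath


class FileResource:
    """Stand-in leaf resource that would render one file."""

    isLeaf = True

    def __init__(self, fullpath):
        self.fullpath = fullpath


class DirectoryResource:
    """Stand-in for a directory node; children are created on demand."""

    isLeaf = False

    def __init__(self, fullpath):
        self.fullpath = fullpath

    def getChild(self, path, request):
        # Build the child's full path from this directory's path.
        child = os.path.join(self.fullpath, path)
        if os.path.isdir(child):
            return DirectoryResource(child)
        if os.path.isfile(child):
            return FileResource(child)
        return NotFoundResource(child)


def descend(root, segments, request=None):
    """Walk the URL segments one getChild() call at a time."""
    res = root
    for segment in segments:
        # Stop at leaves and at anything that cannot have children.
        if getattr(res, "isLeaf", True):
            break
        res = res.getChild(segment, request)
    return res
```

Replacing FileResource with a different class, as suggested above, then only requires overriding getChild in a DirectoryResource subclass.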
Thanks, I have played with your suggestions somewhat. But I want to avoid doing things such as overriding File's render() as I had initially (File.render() does too many nice things to throw away so easily, as pointed out in the other posts).

A source of confusion for me is knowing which specific methods are called automatically, and when. In particular, it would be nice to have a clarification (in the API docs) of when the methods getChildForRequest() and getChildWithDefault() are called -- they seem not to be called in a non-siteroot resource. Things worked well with PathArgs, it being set as root resource, but for an arbitrary resource, like the example I previously included, the game seems to change.

But, indeed, like Clark, I do find this model of cascading requests very intriguing, and tantalizingly powerful...

Cheers, mario
First, before I get started, let me just say that I think that the resource delegation mechanism in this library is just brilliant in its simplicity and operation. However, after much musing, I've decided that getChildWithDefault isn't very useful and kinda muddies the waters:

    # public interface
    def getChild(self, path, request):
        return error.NoResource("No such child resource.")

    # private interface
    def getChildWithDefault(self, path, request):
        if self.children.has_key(path):
            return self.children[path]
        return self.getChild(path, request)

    def getChildForRequest(self, request):
        res = self
        while request.postpath and not res.isLeaf:
            pathElement = request.postpath.pop(0)
            request.acqpath.append(pathElement)
            request.prepath.append(pathElement)
            res = res.getChildWithDefault(pathElement, request)
        return res

Suggested refactor:

    # module variables
    resourceNotFound = error.NoResource('No such child resource.')

    # public interface
    def getChild(self, path, request):
        if self.children.has_key(path):
            return self.children[path]
        return None

    # private interface (called on root only)
    def getChildForRequest(self, request):
        res = self
        while request.postpath and not res.isLeaf:
            pathElement = request.postpath.pop(0)
            request.acqpath.append(pathElement)
            request.prepath.append(pathElement)
            res = res.getChild(pathElement, request)
            if res is None:
                return resourceNotFound
        return res

Rationale:

1. It is very useful to have a *public* interface function which is _always_ called for every request. In this manner, an application can implement request modifiers/filters. Currently the function that satisfies this need, getChildWithDefault, is private.

2. Unless you break the public interface, the current mechanism always searches children first without a hook for the application. This isn't always desirable. For example, a 'security' FilterResource may want to check user access before descending down a given resource sub-tree. Yes, you could implement this security as part of each resource (by inheriting), but I feel that this is inferior to having a more "component" based solution where the security filter is injected into the resource tree.

3. From an object-oriented perspective, getChildWithDefault actually does the 'default' behavior that people may want to inherit and discard, and thus this default searching code should go into getChild instead; the user can then decide how best to use this default behavior.

4. getChild's current interface, always returning a resource (albeit a not-very-useful one), limits possible innovative combinations of intra-resource delegation and cooperation. It should instead return a None value which can be tested for...

Impact of the change:

Any previously written resource that depended on the set of children being searched *before* getChild is called would break. I think that this is probably a pretty rare event, but it is a clean break, and the fix is simple...

    class MyResource(Resource):
        def getChild(self, path, request):
            res = Resource.getChild(self, path, request)
            if res is None:
                # try to create a dynamic resource and assign it to res
                pass
            return res

Alternatively, if they wanted to search the dynamic resources first, they could code it this way:

    class MyResource(Resource):
        def getChild(self, path, request):
            res = None
            # try to create a dynamic resource and assign it to res
            if res is not None:
                return res
            return Resource.getChild(self, path, request)

Perhaps a few examples would have to be changed, but most likely the above impact is limited to only a few select resources.

Alternative refactor:

The simplest alternative is to add getChildWithDefault to the public interface and document the mechanism. I think that this, in the long run, is not as good as the proposed refactor since it adds extra complexity for the "search children first or last" behavior choice. It's just clunky the way it is, IMHO.

In any case, the Resource finding mechanism in Twisted is very clever, and I'm using my PathArgs *a lot*, so I'd like a solution where my requirements don't require breaking the public interface.

Ohh, and to answer:

On Thu, Feb 27, 2003 at 05:51:02PM +0100, Mario Ruggier wrote:
| A source of confusion for me is knowing which,
| and when, specific methods are called automatically.
| Particularly, it would be nice to have a clarification (in the
| API docs) of when the methods getChildForRequest() and
| getChildWithDefault() are called -- they seem not be called in
| a non-siteroot resource. Things worked well with PathArgs, it
| being set as root resource, but for an arbitrary resource,
| like the example I previoulsy included, the game seems to
| change .

getChildWithDefault is in fact called on non-siteroot resources, so the current code for PathArgs will work at any level... albeit in violation of the private/public encapsulation. But yes, the overall mechanism (by having a 3rd wheel) is less than ideal.

Best, Clark
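To make the proposal concrete, here is a self-contained sketch of the suggested traversal with a filter injected into the tree. It uses plain Python stand-ins rather than Twisted itself. The Resource class, the dict-based request, the "authenticated" key and SecurityFilter are all hypothetical and exist only for this illustration.

```python
class Resource:
    """Stand-in base class following the proposed getChild contract."""

    isLeaf = False

    def __init__(self):
        self.children = {}

    def putChild(self, path, child):
        self.children[path] = child

    def getChild(self, path, request):
        # Proposed behaviour: look up static children, None if missing.
        return self.children.get(path)


NOT_FOUND = Resource()  # stand-in for error.NoResource


def getChildForRequest(root, request):
    """Traversal loop from the proposal: getChild is the only hook."""
    res = root
    while request["postpath"] and not res.isLeaf:
        segment = request["postpath"].pop(0)
        request["prepath"].append(segment)
        res = res.getChild(segment, request)
        if res is None:
            return NOT_FOUND
    return res


class SecurityFilter(Resource):
    """Hypothetical filter: checks access before the normal child lookup."""

    def getChild(self, path, request):
        if not request.get("authenticated"):
            return None
        return Resource.getChild(self, path, request)


# Build /private/report with the filter guarding the sub-tree.
root = Resource()
private = SecurityFilter()
report = Resource()
private.putChild("report", report)
root.putChild("private", private)
```

Because getChild is called for every path segment, the filter sees each request before any of its static children are consulted, which is the hook point (2) in the rationale asks for.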
participants (6)
- Andrew Bennetts
- Christopher Armstrong
- Clark C. Evans
- Jp Calderone
- Mario Ruggier
- Tommi Virtanen