Serving files from many web-servers thru one central web-server

Hi,

I have a subnet full of web servers using Twisted, and a main web server based on Twisted as well. The main server is available to the outside world. I want to serve files on the other web servers on the subnet through the main web server, so the main server has to be able to handle several clients at once.

A few users connect to the main server, requesting files on the subnet web-servers. The main server reads data from several subnet servers and writes the data back to the requesting clients. How can I do this in twisted, without blocking, and handle several clients? We're not talking hardcore P2P here with thousands of clients, most likely 2-5 concurrent users, 10 at the most.

Any hints? Or doesn't this make any sense?

--
Mvh/Best regards,
Thomas Weholt
http://www.weholt.org

Thomas Weholt <thomas.weholt@gmail.com> writes:
A few users connect to the main server, requesting files on the subnet web-servers. The main server reads data from several subnet servers and writes the data back to the requesting clients. How can I do this in twisted, without blocking, and handle several clients? We're not talking hardcore P2P here with thousands of clients, most likely 2-5 concurrent users, 10 at the most.
I haven't had an opportunity to use it myself yet, but there is a twisted.web.distrib module that may work since all of your servers are using Twisted.

On the subnet servers, wrap your site object in the ResourcePublisher object (a pb.Root subclass) and set that up to listen on an appropriate port.

On the main server, for each resource root that you want proxied out to a remote server, insert an appropriate child resource using ResourceSubscription.

A PB link is used to transmit the requests between the two machines, which means that all of the server render() calls to be proxied are handled in a deferred fashion.

I expect there may be a way to interconnect your server into a client HTTP class to proxy to the subnet servers using a more traditional web request, but given that you are Twisted throughout, there is probably no reason not to go ahead and use the PB approach.

It does look like these classes still use the older pb.getObjectAt approach for the connection rather than the newer getRootObject, but it should still work. If it doesn't directly suit what you want, it may at least give you an idea for your own approach (since neither of those classes is overly complex).

-- David

Oh, darn!! Forgot all about this when I started my project. I've read through most of the docs I've found so far, but I'm still somewhat clueless. Can anybody provide a simple example of how to do this, preferably without using the examples in http://twistedmatrix.com/documents/current/howto/using-twistedweb#auto19?

I'm hooking an XML-RPC handler and a UDP listener into it as well, and find the good ol'

    site = server.Site(MyResource())
    reactor.listenTCP(8080, site)
    reactor.run()

way of doing it better than mktap etc. I feel like I have more control doing it the manual way.

Anyhow, thanks for your input so far. I just love Twisted!! :-)

Best regards,
Thomas

(In reply to David Bolen <db3l@fitlinxx.com>, 28 Sep 2004 15:25:15 -0400.)
_______________________________________________ Twisted-web mailing list Twisted-web@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-web
-- Mvh/Best regards, Thomas Weholt http://www.weholt.org

Thomas Weholt <thomas.weholt@gmail.com> writes:
Oh, darn!! Forgot all about this when I started my project. I've read thru most of the docs I've found so far, but I'm still somewhat clueless. Can anybody provide a simple example of how to do this, preferrably without using the examples in http://twistedmatrix.com/documents/current/howto/using-twistedweb#auto19. I'm hooking a xmlrpc-handler and a UDP listener into it as well and find the good ol' (...)
Well, I experimented tonight and here's a quick example of some situations. In re-reading your original post, I realized it wasn't clear if your "reads data from several subnet servers" comment referred to wanting to access web resources on the internal servers, or just some other service you needed data from. If the latter, then you can use your own PB session, which can have a richer interface than a web resource if you want, so I gave a simple example of that too.

This example just runs both the simulated main server and subnet server objects within the same process over a loopback connection, but should work identically over any other link.

There is a main and child resource on the server side, and the same on the internal/subnet side, tied into a URL on the server side by using the ResourcePublisher/ResourceSubscription. An additional server side resource turns into its own pure PB call to a remote server object. Once running you can access the following URLs:

    http://localhost:8000                  MainRoot
    http://localhost:8000/child            MainChild
    http://localhost:8000/data             InternalData.remote_getData()
    http://localhost:8000/internal         InternalRoot
    http://localhost:8000/internal/child   InternalChild

In the InternalData case, I'm making a new PB session for each rendering request. Presumably you'd want to structure things to maintain a persistent connection, only reconnecting when necessary (which, BTW, is basically what ResourceSubscription does).

It all seems to work as expected... it's simplistic but hopefully it'll point you in the right direction. There shouldn't be any problem tying additional protocols (such as XMLRPC/UDP) into either the main server or subnet server processes.
-- David

- - - - - - - - - - - - - - - - - - - - - - - - -

import sys

from twisted.python import log
from twisted.internet import reactor
from twisted.spread import pb
from twisted.web import server, resource, distrib

#
# An internal PB server object on the internal subnet server, with simple
# direct access (no authentication) via a root object.
#

class InternalData(pb.Root):

    def remote_getData(self):
        # This could be its own deferrable operation
        return 'Internal data'

#
# Resources on the internal subnet servers
#

class InternalChild(resource.Resource):
    """A child resource rendered on the internal server"""

    def render(self, request):
        return '<html><body>Internal Child</body></html>'

class InternalRoot(resource.Resource):
    """The root of the tree rendered on the internal server"""

    def getChild(self, path, request):
        # Support direct rendering (no trailing "/" on request)
        if path == '':
            return self
        else:
            return resource.Resource.getChild(self, path, request)

    def render(self, request):
        return '<html><body>Internal Root</body></html>'

#
# Resources on the primary web server
#

class MainData(resource.Resource):
    """A child resource that renders a data call to the internal server"""

    def __init__(self, host, port):
        resource.Resource.__init__(self)
        self.host = host
        self.port = port

    def render(self, request):
        """Make a request to the remote root object, and use that result
        as the result of our rendering"""

        def failure(value):
            request.write('<html><body>'
                          'Unable to access data:<br>%s'
                          '</body></html>' % value)

        def success(value):
            request.write('<html><body>%s</body></html>' % value)

        # Right now we make a new connection to the internal host for
        # each rendering request (pretty darn inefficient!)
        factory = pb.PBClientFactory()
        reactor.connectTCP(self.host, self.port, factory)

        # Obtain a reference to the remote object, call the getData method
        # and then disconnect.
        root = factory.getRootObject()
        root.addCallback(lambda root:
                         root.callRemote('getData').addCallback(success))
        root.addErrback(failure)
        root.addCallback(lambda _: request.finish())
        root.addCallback(lambda _: factory.disconnect())
        return server.NOT_DONE_YET

class MainChild(resource.Resource):
    """A child resource rendered directly on the main server"""

    def render(self, request):
        return '<html><body>Main Server Child</body></html>'

class MainRoot(resource.Resource):
    """The primary root resource on the main server"""

    def getChild(self, path, request):
        # Support direct rendering (no trailing "/" on request)
        if path == '':
            return self
        else:
            return resource.Resource.getChild(self, path, request)

    def render(self, request):
        return '<html><body>Main Server Root</body></html>'

#
# Simulate main and subnet servers.  The main server will listen on port
# 8000 and the subnet server will listen (for PB connections) on port 8001.
# Additionally the subnet server will provide the InternalData object
# on port 8002.
#

if __name__ == "__main__":

    #
    # Build up a subnet server "site":
    #     /          InternalRoot
    #     /child     InternalChild
    #
    iroot = InternalRoot()
    iroot.putChild('child', InternalChild())
    isite = server.Site(iroot)

    #
    # Build up the main server "site":
    #     /          MainRoot
    #     /child     MainChild
    #     /data      Render result of call to InternalData's getData
    #     /internal  Request to / on subnet server
    #
    root = MainRoot()
    root.putChild('child', MainChild())
    root.putChild('data', MainData('localhost', 8002))
    root.putChild('internal', distrib.ResourceSubscription('localhost', 8001))
    site = server.Site(root)

    #
    # Now start both servers listening.  Note that if these were really
    # running on separate machines, the internal server could do a listenTCP
    # for isite on 8000 to support normal web lookups, while also supporting
    # port 8001 for the PB proxied lookups.
    #
    reactor.listenTCP(8000, site)
    reactor.listenTCP(8001,
                      pb.PBServerFactory(distrib.ResourcePublisher(isite)))
    reactor.listenTCP(8002, pb.PBServerFactory(InternalData()))

    log.startLogging(sys.stdout)
    reactor.run()

Sweet!!! I'm going to try it when I get off work.

In my original design and implemented prototype, the slave nodes on the subnet answer a UDP broadcast from the main server, which in turn keeps a list of slave nodes on the local subnet, connecting to them and communicating with them using a mix of UDP and XMLRPC. Making a connection to the local slave nodes or adding them to the main server will be a bit more tricky using the code you supplied, but hey!!! If I can re-implement my idea using something like this it would be so much better.

Thanks again!! :-)

Best regards,
Thomas

(In reply to David Bolen <db3l@fitlinxx.com>, 29 Sep 2004 02:40:47 -0400.)
-- Mvh/Best regards, Thomas Weholt http://www.weholt.org
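The bookkeeping described above, where the main server keeps a list of nodes that answered a broadcast, does not depend on the transport. A minimal sketch of such a list is shown below; the class name NodeRegistry, its methods, and the 30-second expiry are hypothetical choices for illustration, not details of Thomas's prototype.

```python
import time


class NodeRegistry:
    """Track subnet nodes that recently answered a discovery broadcast."""

    def __init__(self, max_age=30.0, clock=time.monotonic):
        self.max_age = max_age      # seconds before a silent node is dropped
        self._clock = clock         # injectable so expiry can be tested
        self._last_seen = {}        # (host, port) -> time of last reply

    def seen(self, host, port):
        """Record that the node at (host, port) just answered."""
        self._last_seen[(host, port)] = self._clock()

    def active(self):
        """Drop stale entries and return the remaining nodes, sorted."""
        cutoff = self._clock() - self.max_age
        self._last_seen = {
            addr: stamp
            for addr, stamp in self._last_seen.items()
            if stamp >= cutoff
        }
        return sorted(self._last_seen)
```

In a Twisted UDP listener, seen() would typically be called from DatagramProtocol.datagramReceived(), and active() consulted when deciding which nodes to attach to the main server.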

Thomas Weholt <thomas.weholt@gmail.com> writes:
Sweet!!! I'm going to try it when I get off work. In my original design and implemented prototype the slave nodes on the subnet answers a UDP broadcast from the main server, which in turns keeps a list of slave-nodes on the local subnet, connecting to them and communicating with them using a mix of UDP and XMLRPC. Making a connection to the local slavenodes or adding them to the main server will be a bit more tricky using the code you supplied, but hey !!! If I can re-implement my idea using something like this it would be so much better.
Note that there's nothing that says you have to use PB to make the internal requests. If you already have a working UDP/XMLRPC mechanism, just go ahead and keep using it.

Any resource's render() operation can just return server.NOT_DONE_YET, and then manage the request itself, completing it whenever it can, deferrable or not. Just request.write() whatever data you eventually want to, and don't forget to do the request.finish() when done.

You could just as easily make an XMLRPC request to an internal server from within a render() as a callRemote to a PB server.

-- David
participants (2): David Bolen, Thomas Weholt