REST web service and big data
Hi,

I'm new to Twisted programming and I'm wondering how to do the following.

I would like to save the body of a PUT request to a file, but I need to do it in a streaming fashion (the data may be hundreds of MB).

Here is part of my code:

    class DataResource(resource.Resource):
        def __init__(self, dbConnection):
            resource.Resource.__init__(self)

        def render_PUT(self, request):
            request.content.seek(0)
            file('data.dat', 'wb').write(request.content.read())
            request.write('OK')
            request.finish()

It's inspired by an example in the O'Reilly book.

Is there a way to get a callback just after the headers have been received, and to handle reading the remaining data myself?

I use only twisted.web, not the new web2 API.

Thanks in advance for any help or links.

Sébastien
Hi Sébastien, On Mon, 20 Aug 2007 12:11:33 -0500, Sébastien HEITZMANN <2le@2le.net> wrote:
I use only web, not the new web2 api.
You cannot stream large files using twisted.web unless you write your own mechanism. On the other hand, web2 *does* support streaming file uploads, so I would advise you to think about using web2 instead, if you really want streaming. Someone with deeper knowledge of twisted.web may be able to propose a strategy for implementing streaming file uploads, but I expect it would be a fair amount of work, and end up looking similar to what is already in web2.
Hope this helps, -- L. Daniel Burr
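As a stopgap that stays within twisted.web, the posted `render_PUT` can at least avoid pulling the whole body into one string. The sketch below assumes only what the posted code already implies, namely that `request.content` is a seekable file-like object; `copy_in_chunks` and `CHUNK_SIZE` are hypothetical names, and `io.BytesIO` stands in for `request.content`. Note that this is not streaming during the upload: the body has already been received in full by the time it runs.

```python
import io
import os
import tempfile

CHUNK_SIZE = 64 * 1024  # bytes per read; an arbitrary choice for this sketch


def copy_in_chunks(source, dest_path, chunk_size=CHUNK_SIZE):
    """Copy a file-like object to dest_path one chunk at a time.

    Returns the number of bytes written.
    """
    total = 0
    source.seek(0)
    with open(dest_path, "wb") as dest:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:  # empty read means end of data
                break
            dest.write(chunk)
            total += len(chunk)
    return total


if __name__ == "__main__":
    # io.BytesIO is a stand-in for request.content in this demo.
    body = io.BytesIO(b"x" * 200000)
    path = os.path.join(tempfile.mkdtemp(), "data.dat")
    print(copy_in_chunks(body, path))
```

Inside `render_PUT`, the call would replace the `file(...).write(request.content.read())` line, passing `request.content` as `source`.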
On Mon, 20 Aug 2007 12:40:49 -0500, "L. Daniel Burr" <ldanielburr@mac.com> wrote:
You cannot stream large files using twisted.web unless you write your own mechanism. On the other hand, web2 *does* support streaming file uploads, so I would advise you to think about using web2 instead, if you really want streaming.
Someone with deeper knowledge of twisted.web may be able to propose a strategy for implementing streaming file uploads
Don't mind if I do ;)

HTTPChannel already notices the difference between when the headers have all been received and when the body has been received entirely. When the former occurs, allHeadersReceived is called. In the base implementation this sets up a file-like object into which the body will be written. It would be possible to do something slightly different here in order to support streaming uploads: do resource traversal to find the IResource the upload is being sent to and then let it deal with bytes received in the body of the request.

The only other things which might not be obvious here. Changes to twisted.web should be backwards compatible so that existing twisted.web applications continue to work without being modified. Implementing what I've described above without regard for backwards compatibility would probably mean subjecting existing applications to two things:

* Resource traversal would be performed earlier than usual for the application. This might have adverse consequences, or it might not. In the absence of any way to know for sure, we shouldn't change this behavior. So, instead, the code might require a new kind of site, or an attribute to be set on the root resource, or something else of this sort which would allow new applications to indicate their preference for the new behavior while preserving the existing behavior for existing applications.

* The body of a request is currently available in the request object itself. Existing applications won't expect it to be elsewhere, nor will they expect to have to handle the upload as it is happening. It should be required that resources indicate in some way that they are capable of handling streaming uploads. This might be done by adding a new interface which they must implement (since they will need to provide methods for handling bytes from the upload, this is necessary anyway).
but I expect it would be a fair amount of work, and end up looking similar to what is already in web2.
Well, "fair" is quite subjective, so maybe it is and maybe it isn't ;) It doesn't strike me as a massive undertaking, though. I think an initial patch could probably be done in a day or two. Allow another couple of days (not necessarily elapsed - there might be some latency in finding reviews, etc) to get feedback and make whatever improvements are suggested, and that would probably be it.

FWIW, what I described doesn't resemble the support for this functionality in web2 at all, I think.

Jean-Paul
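The opt-in idea from this message can be illustrated without Twisted at all. The sketch below is a toy model, not twisted.web code: `handle_upload` plays the role of the channel once the headers are in, and every name in it (`handlesStreamingUploads`, `uploadStarted`, `uploadDataReceived`, `uploadFinished`, `FakeRequest`) is hypothetical. A plain attribute is used as the opt-in marker for brevity; the message suggests a real interface instead. Resources that do not opt in still find the whole body in `request.content`, which is the backwards-compatible path.

```python
import io


class FakeRequest:
    """Minimal stand-in for a request object (hypothetical)."""

    def __init__(self):
        self.content = None


class BufferedResource:
    """Existing-style resource: expects the full body in request.content."""

    def render(self, request):
        return len(request.content.read())


class StreamingResource:
    """New-style resource that opts in to receiving the body in pieces."""

    handlesStreamingUploads = True  # hypothetical opt-in marker

    def __init__(self):
        self.received = 0
        self.chunks = 0

    def uploadStarted(self, request):
        self.received = 0
        self.chunks = 0

    def uploadDataReceived(self, data):
        self.received += len(data)
        self.chunks += 1

    def uploadFinished(self, request):
        return self.received


def handle_upload(resource, request, body_chunks):
    """Simulate the channel after all headers have been received."""
    if getattr(resource, "handlesStreamingUploads", False):
        # New behavior: hand each piece of the body to the resource.
        resource.uploadStarted(request)
        for chunk in body_chunks:
            resource.uploadDataReceived(chunk)
        return resource.uploadFinished(request)
    # Existing behavior: buffer everything, then render as before.
    request.content = io.BytesIO()
    for chunk in body_chunks:
        request.content.write(chunk)
    request.content.seek(0)
    return resource.render(request)
```

Both kinds of resource go through the same entry point, so an application that never sets the marker sees no change in behavior.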
On Mon, 20 Aug 2007 13:13:09 -0500, Jean-Paul Calderone <exarkun@divmod.com> wrote: [Discussion of streaming file uploads in twisted.web]
Someone with deeper knowledge of twisted.web may be able to propose a strategy for implementing streaming file uploads
Don't mind if I do ;)
Always good to have a core developer weigh in on these matters ;)
HTTPChannel already notices the difference between when the headers have all been received and when the body has been received entirely. When the former occurs, allHeadersReceived is called. In the base implementation this sets up a file-like object into which the body will be written. It would be possible to do something slightly different here in order to support streaming uploads: do resource traversal to find the IResource the upload is being sent to and then let it deal with bytes received in the body of the request.
That sounds pretty reasonable.
The only other things which might not be obvious here. Changes to twisted.web should be backwards compatible so that existing twisted.web applications continue to work without being modified. Implementing what I've described above without regard for backwards compatibility would probably mean subjecting existing applications to two things:
You lost me here. First you say that changes to twisted.web should be backwards-compatible, then you go on to describe how to do things in a non-compatible manner. I don't have a preference regarding the issue of compatibility, but I'm not clear as to whether you do.
* resource traversal would be performed earlier than usual for the application. This might have adverse consequences, or it might not. In the absence of any way to know for sure, we shouldn't change this behavior. So, instead, the code might require a new kind of site, or an attribute to be set on the root resource, or something else of this sort which would allow new applications to indicate their preference for the new behavior while preserving the existing behavior for existing applications.
Sure, this sounds a bit like the way nevow uses Element for doing things in the new, context-less way, while leaving Fragment in place to handle the existing, context-laden way.
* The body of a request is currently available in the request object itself. Existing applications won't expect it to be elsewhere, nor will they expect to have to handle the upload as it is happening. It should be required that resources indicate in some way that they are capable of handling streaming uploads. This might be done by adding a new interface which they must implement (since they will need to provide methods for handling bytes from the upload, this is necessary anyway).
but I expect it would be a fair amount of work, and end up looking similar to what is already in web2.
Well, "fair" is quite subjective, so maybe it is and maybe it isn't ;)
Point taken, although I'll point out that from the perspective of the original poster, "fair amount of work" might mean "lots of work and cursing", given the usual newbie experience with twisted and its learning curve.
It doesn't strike me as a massive undertaking, though. I think an initial patch could probably be done in a day or two. Allow another couple of days (not necessarily elapsed - there might be some latency in finding reviews, etc) to get feedback and make whatever improvements are suggested, and that would probably be it.
Well, that'd be pretty awesome, and would benefit nevow users too.
FWIW, what I described doesn't resemble the support for this functionality in web2 at all, I think.
True, and I'm somewhat surprised, given the work currently going on with web2. Isn't dialtone's consumer/producer oriented stream stuff going to be "the way" to do this sort of thing? Does twisted.web have to approach it differently, due to design constraints?

Thanks for taking the time to think through this,

L. Daniel Burr
On Mon, 20 Aug 2007 13:33:09 -0500, "L. Daniel Burr" <ldanielburr@mac.com> wrote:
[snip]
The only other things which might not be obvious here. Changes to twisted.web should be backwards compatible so that existing twisted.web applications continue to work without being modified. Implementing what I've described above without regard for backwards compatibility would probably mean subjecting existing applications to two things:
You lost me here. First you say that changes to twisted.web should be backwards-compatible, then you go on to describe how to do things in a non-compatible manner. I don't have a preference regarding the issue of compatibility, but I'm not clear as to whether you do.
Oops :) To clarify, when this is implemented, it should be implemented in a way which _is_ backwards compatible.
[snip]
FWIW, what I described doesn't resemble the support for this functionality in web2 at all, I think.
True, and I'm somewhat surprised, given the work currently going on with web2. Isn't dialtone's consumer/producer oriented stream stuff going to be "the way" to do this sort of thing? Does twisted.web have to approach it differently, due to design constraints?
Fitting this into twisted.web with an API similar to that of twisted.web2 would be more challenging. It might be possible though. Perhaps one of the web2 developers can comment on that in more detail.

Jean-Paul
Sébastien HEITZMANN wrote:
Hi
I'm new to Twisted programming and I'm wondering how to do the following.
I would like to save the body of a PUT request to a file, but I need to do it in a streaming fashion (the data may be hundreds of MB).
You can use nginx as a proxy server. nginx can save the entire request body in a file. You can pass the file name to the application using a header.
[...]
Regards Manlio Perillo
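On the application side, the proxy approach could look roughly like the sketch below. It assumes the proxy has already written the body to disk and forwards the path in a request header; the header name `X-Body-File` and the function `store_proxied_body` are hypothetical. On the nginx side, the `client_body_in_file_only` directive and the `$request_body_file` variable appear to be the relevant pieces, but check the nginx documentation before relying on them.

```python
import os
import shutil
import tempfile

BODY_FILE_HEADER = "X-Body-File"  # assumed header name, set by the proxy


def store_proxied_body(headers, dest_path, header_name=BODY_FILE_HEADER):
    """Move the body file saved by the proxy to dest_path.

    `headers` is any mapping of header names to values.
    """
    src = headers.get(header_name)
    if not src or not os.path.isfile(src):
        raise ValueError("missing or invalid %s header" % header_name)
    # A move avoids reading the (possibly huge) body in the application.
    shutil.move(src, dest_path)
    return dest_path


if __name__ == "__main__":
    # Simulate a body that the proxy has already written to disk.
    workdir = tempfile.mkdtemp()
    saved = os.path.join(workdir, "body.tmp")
    with open(saved, "wb") as f:
        f.write(b"uploaded bytes")
    dest = os.path.join(workdir, "data.dat")
    print(store_proxied_body({BODY_FILE_HEADER: saved}, dest))
```

Since the header carries a filesystem path, the application should only honor it on requests that really come from the proxy.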
participants (4)
- Jean-Paul Calderone
- L. Daniel Burr
- Manlio Perillo
- Sébastien HEITZMANN