[Twisted-Python] Sending large files over network with perspective broker
I'm writing a GTK application that transfers files over a LAN. The application has a server and can spawn several clients (one for each file to send). The flow between client and server is something like this: the client asks pb.Root for a FileSender (conceptually a perspective) and receives it. The client declares the size and basename of the file it is sending, requesting authorization to send. The server performs the authorization and passes the client a unique key to start the file transfer. The client then performs the transfer in "chunks": each chunk is passed through a remote method, send_chunk. The transfer is done "recursively": each send_chunk deferred generates a new deferred for the next chunk. Some pseudocode to illustrate my solution:

    FileSender:
        remote_get_auth():
            ...
        remote_request_for_sending(filename, size):
            return secret
        remote_send_chunk(secret, chunk_no, data):
            # save the chunk somewhere

    Client:
        proceed_sending():
            chunk_tot = CHUNK_TOT
            chunk_no = 0
            def send(_):
                if chunk_no == CHUNK_TOT:
                    return
                else:
                    # ... read data, advance chunk_no ...
                    d = filesender.callRemote("send_chunk", secret, chunk_no, data)
                    d.addCallback(send)

I've done it this way so that a new chunk is sent only after the previous chunk has been sent. The problem with this approach is that it blocks my GUI, and I can't figure out why: I'm just generating deferreds, so it shouldn't block. I've seen the page about Consumers and Producers, but I can't figure out how to integrate producers and consumers into Perspective Broker based code. Can someone help me?
Does send_chunk or callRemote block?

-J

On Sat, May 22, 2010 at 12:35 PM, Gabriele Lanaro <gabriele.lanaro@gmail.com> wrote:
_______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
Gabriele Lanaro <gabriele.lanaro@gmail.com> writes:
The problem of this approach is that this blocks my GUI, I can't figure out why because I'm just generating deferreds so it shouldn't block.
Just using deferreds won't help unless you still manage to return control back up the chain to the main event loop. I suspect something must be blocking somewhere, though it's hard to say from the pseudo-code. Most likely a few judiciously placed logging statements would let you see where, or at least verify that you are not returning to the main event loop during the transfer.

I will say that chunking up a large transfer through individual PB requests adds a bit of overhead for a large stream, and unless you implement some sort of windowing protocol, it can hurt performance due to the latency of waiting for the ACK from the server for each chunk.

I had what appears to be a similar requirement in terms of transmitting a large file (A/V files to be published) as part of an overall PB session, and decided to separate it out into its own file upload server component coordinated through the PB session. http://twistedmatrix.com/pipermail/twisted-python/2007-July/015738.html has some further details on what I ended up doing. Perhaps an approach along these lines would work for you as well.

-- David
Thank you very much for your responses. The problem seems to appear when in my tests I send the file "to myself"; when I send files over the network, things go well. I suspect (it's just a suspicion) that the code spawns too many deferreds too fast, causing the loop not to complete (or something like that).

Anyway, I think I'll follow your suggestion and end up splitting the upload service from the control/authorization one, since the code can grow too complex and performance is a requirement.

- Gabriele

2010/5/23 David Bolen <db3l.net@gmail.com>
After putting in some strategic sleeps, it seems that the problem is that the application is the server and the client at once; I think this generates a "loop" in the mainloop.

2010/5/23 Gabriele Lanaro <gabriele.lanaro@gmail.com>
Finally I managed to solve this obscure bug myself. I put a reactor.iterate() call before spawning new callbacks; this way I force the mainloop to complete its cycle. The pseudocode would be modified this way:

    Client:
        proceed_sending():
            chunk_tot = CHUNK_TOT
            chunk_no = 0
            def send(_):
                if chunk_no == CHUNK_TOT:
                    return
                else:
                    # ... read data ...
                    reactor.iterate()  # <---- THIS LINE
                    d = filesender.callRemote("send_chunk", secret, chunk_no, data)
                    d.addCallback(send)

2010/5/23 Gabriele Lanaro <gabriele.lanaro@gmail.com>
On 12:21 pm, gabriele.lanaro@gmail.com wrote:
Erm. Sorry. This isn't a solution to whatever problem you're having. It is entirely invalid to use reactor.iterate() in this way.

Jean-Paul
In what sense is it invalid? I don't know how the gtk reactor works; I just guessed that the event loop never reaches the GUI events. My idea was to force the processing of those events before spawning another deferred. It's just a workaround; the real problem is the fact that the server and the client reside in the same loop (for testing). What can be the reason for the mainloop "block"?

2010/5/23 <exarkun@twistedmatrix.com>
On May 23, 2010, at 6:36 PM, Gabriele Lanaro wrote:
It's invalid to run reactor.iterate() inside the reactor mainloop. You can't force event-processing order in Twisted; if you want an event not to get processed, you need to delay its event source from getting invoked (producer.pauseProducing(), transport.stopReading(), transport.stopWriting(), and Deferred.pause() are all ways to do this).

It's invalid to use reactor.iterate() in this way because the reactor may invoke you reentrantly, and there's no sane way to handle that. For example: your code is running because select() said your file descriptor was ready for reading, which invoked dataReceived, which invoked your method with buffered data, which called iterate(), which called dataReceived, which invoked your method with buffered data, which called iterate(), and so on, forever, until your application code conflicts with itself and starts blowing up and throwing incomprehensible tracebacks everywhere because of "impossible" recursion.
What can be the reason for the mainloop "block"?
Lots of reasons. The example you gave wasn't syntactically valid Python, so it's hard to say. Consider sending along an SSCCE <http://sscce.org/> and maybe we can tell you more :).
Thank you very much for your explanation! Now I understand the problem with iterate(). Along these lines I've prepared a little test case (attached):

Test 1:
$ python runner.py
Click on the button, then minimize/unminimize the window to force a widget redraw; the window will be blank. You can click the button again; the event is caught, but no "button animation" is performed.

Test 2:
$ python runner.py
Open another console:
$ python runner_other_process.py
Click on the button of runner_other_process; this connects to the server in the first process. Each window redraws correctly.

2010/5/24 Glyph Lefkowitz <glyph@twistedmatrix.com>
On 08:39 am, gabriele.lanaro@gmail.com wrote:
This may demonstrate a bug in gtk2reactor. It seems to be servicing network events to the exclusion of GUI events, which it isn't supposed to do.

I don't see any obvious reason for this. Unfortunately glib2 (or pygtk2, perhaps) is ultimately in charge of the ordering/priority of these event handlers; gtk2reactor is just a thin layer on top of the glib2-supplied I/O notification APIs. But perhaps there's a way we could be invoking these APIs differently so that the GUI gets more of a chance to run.

Jean-Paul
I have this problem also with a consumer/producer over a LineReceiver protocol. Should I submit a bug report?

2010/5/24 <exarkun@twistedmatrix.com>
participants (5)
- David Bolen
- exarkun@twistedmatrix.com
- Gabriele Lanaro
- Glyph Lefkowitz
- Jason J. W. Williams