Hello,
I have been trying to use the producer/consumer paradigm to transfer large strings, and I'm seeing strange behavior. I now suspect the P/C code itself isn't at fault, because it works fine with smaller strings, so the problem probably comes from elsewhere. The transfer starts off fine, but at some point the destination has only received part of the string even though the source has already sent everything, closed its transport and called stopProducing(). Could there be some internal buffering problem, such as sending too much data at too high a throughput, or some maximum time a socket can stay open? Telling it to split the string into smaller parts doesn't change anything. If I make the string itself smaller, it transfers fine whatever the chunk size.
Thanks, Gabriel
I removed the P/C code and went back to the previous version, and the problem persists. It happens when I send an XML message (with a very small XML part) whose data payload contains a string generated like this:

    initialMsg = "".join([str(x) for x in range(5000)])

which is 10374 bytes of data (including the XML). This makes me suspect that some buffer fills up and doesn't have time to empty, and Twisted stops working. I'm using version 2.5, since Ubuntu hasn't upgraded their version yet (and apparently won't until lenny) and I can't get it to compile from source (yes, I installed python-dev and build-essential), so maybe the problem is specific to this version. Does anyone have an idea of what the problem is? I'm using the following code to send messages. I don't think that's the problem, but you never know:
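    from twisted.internet import defer, reactor
    from twisted.internet.protocol import ClientCreator, Protocol
    from twisted.words.xish import domish

    def sendMessage(address, port, message, needAnswer=False):
        # d stays None when no answer is expected
        d = defer.Deferred() if needAnswer else None

        class MessageSender(Protocol):
            def sendMessage(self, msg):
                if domish.IElement.providedBy(msg):
                    msg = msg.toXml()
                if isinstance(msg, unicode):
                    msg = msg.encode('utf-8')
                self.transport.write(msg)

            def dataReceived(self, data):
                # may be called several times, each with only part of the answer
                if d is not None and not d.called:
                    d.callback(data)

        def gotProtocol(proto):
            proto.sendMessage(message)
            if not needAnswer:
                proto.transport.loseConnection()

        c = ClientCreator(reactor, MessageSender)
        c.connectTCP(address, port).addCallback(gotProtocol)
        return d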
Gabriel
Ok, so apparently it (Twisted, Python, the OS?) can buffer and send up to 16k, and after that it splits the message up. That would explain the lockup, since the app expects a whole XML message and can't process the second part correctly. So if I understand correctly, even when I use the P/C paradigm, the data goes out one full buffer at a time, so on the other side I only get part of the message. The other half arrives as a new message/data event, and thus, like I said above, the XML is not correct.
Does anyone have an idea of how to solve this?
Thanks, Gabriel
On Wed, 23 Apr 2008 10:41:29 +0200, Gabriel Rossetti mailing_lists@evotex.ch wrote:
Does anyone have an idea of how to solve this?
This can't be solved. From the way you describe it, the software on the receiving end is simply broken. TCP provides no guarantees about how much data will be delivered to the recipient at a time, regardless of how much is sent at a time. Every participant along the delivery path between the sender and the recipient is allowed to break packets into smaller pieces or coalesce packets into larger pieces. The recipient *must* be able to handle incomplete messages by waiting for more bytes. It must also be able to handle packets which contain bytes from more than one message.
There are a number of ways to address this. Almost all of them involve changing the software running on the peer you're sending messages to and the protocol the two programs are using to talk to each other. For example, you can send a length prefix before each message allowing the recipient to buffer up the correct number of bytes before trying to deal with them.
Jean-Paul
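For reference, the length-prefix idea can be sketched in plain Python, independent of Twisted. This is only a minimal illustration, not code from the thread: the 4-byte big-endian header and the names `frame` and `FrameDecoder` are assumptions made for the example. Twisted's own `twisted.protocols.basic.Int32StringReceiver` implements a similar scheme. The decoder buffers whatever arrives and only hands back messages once all of their bytes are present, so it copes with both split and coalesced deliveries.

```python
import struct

# Assumed wire format for this sketch: 4-byte big-endian length, then the payload.
HEADER = struct.Struct("!I")


def frame(payload):
    """Prefix a bytes payload with its length so the receiver knows where it ends."""
    return HEADER.pack(len(payload)) + payload


class FrameDecoder(object):
    """Hypothetical helper: accumulates raw bytes and yields complete messages."""

    def __init__(self):
        self._buf = b""

    def feed(self, data):
        """Add newly received bytes; return the list of messages now complete."""
        self._buf += data
        messages = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # body not fully received yet, wait for more bytes
            messages.append(self._buf[HEADER.size:end])
            self._buf = self._buf[end:]
        return messages
```

In a Twisted protocol, `dataReceived` would pass each chunk to `feed()` and process whatever complete messages come back; the sender would write `frame(msg)` instead of the bare message.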
Thank you Jean-Paul, I'll do something like that then and fix the peer. I didn't realize that, and thought I could get the message as a whole somehow. At least I know my P/C is working :-)
Gabriel
On Wednesday 23 April 2008, Gabriel Rossetti wrote:
Ok, so apparently it (Twisted, Python, the OS?) can buffer and send up to 16k, and after that it splits the message up. That would explain the lockup, since the app expects a whole XML message and can't process the second part correctly.
If you use an event-based XML parser (like something that implements SAX), you can simply feed it the bytes as they come in from the network. The XML tag nesting will tell you when the entire message has been received, which is when you receive the "close tag" callback for the root tag.
Bye, Maarten
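A rough sketch of the event-based approach, using the standard library's expat bindings rather than any particular parser from the thread. The class name `RootCloseDetector` and the element names in the usage note are made up for illustration. Each network chunk is fed to the parser as it arrives, and the message is considered complete when the end-tag callback fires for the root element. Note that this handles a single XML document per parser; a stream carrying several messages would need a fresh parser (or a wrapping root element) per message.

```python
import xml.parsers.expat


class RootCloseDetector(object):
    """Hypothetical helper: reports when the root element of a message has closed."""

    def __init__(self):
        self.depth = 0
        self.root = None
        self.done = False
        self._parser = xml.parsers.expat.ParserCreate()
        self._parser.StartElementHandler = self._start
        self._parser.EndElementHandler = self._end
        # Newer expat versions may postpone callbacks for very small chunks;
        # turn that off where the method is available so events fire promptly.
        if hasattr(self._parser, "SetReparseDeferralEnabled"):
            self._parser.SetReparseDeferralEnabled(False)

    def _start(self, name, attrs):
        if self.depth == 0:
            self.root = name
        self.depth += 1

    def _end(self, name):
        self.depth -= 1
        if self.depth == 0:
            self.done = True  # the root's close tag has been seen

    def feed(self, chunk):
        """Parse one chunk of bytes; return True once the whole message is in."""
        self._parser.Parse(chunk, False)
        return self.done
```

For example, a message such as `<msg><data>...</data></msg>` split into arbitrary chunks would make `feed()` return False for every chunk until the one containing the final `</msg>`.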
Yes, that's what I ended up doing :-) It works great! Thanks
Gabriel