[Twisted-Python] NetstringReceiver

Hi! The NetstringReceiver class does not exactly do what I'd expect - see issue http://twistedmatrix.com/trac/ticket/4378. Since we plan to use netstrings, I am trying to fix it. I am not quite sure how to handle certain problems, though, and that's where I'd appreciate help. If this is the wrong place to ask this kind of questions, please redirect me ;-). 1.) In the current implementation, a NetstringParseError results in a call to self.transport.loseConnection(). Furthermore the variable brokenPeer is set to 1. But this variable is ignored. Would it be better to check it before receiving data? If yes: what is the correct way for a protocol to handle a broken transport? There is a ConnectionLost exception, but I'm not sure if / when a Protocol is allowed to raise it. 2.) The current implementation uses the method dataReceived to assemble the content of the netstring. When it is complete, the method stringReceived (which raises a NotImplementedError by default) is called. But netstrings can be long - would it be better to create a Deferred instance and call its callback instead? What would be the right place to create this Deferred? 3.) I started testing the NetstringReceiver by creating a class comment with doctests. When I later grepped the sources, I found some doctests, but there no mention of the doctest module in the test policy document. Would it be better to use the unittest framework for testing the code? Thanks and best regards, -- Albert Brandl Weiermayer Solutions GmbH 4813 Altmünster | Abteistraße 12 | Austria Fon: +43(0)720 70 30 14 | Fax: +43 (0) 7612 88081

Albert Brandl wrote:
Hi!
The NetstringReceiver class does not exactly do what I'd expect - see issue http://twistedmatrix.com/trac/ticket/4378. Since we plan to use netstrings, I am trying to fix it.
I am not quite sure how to handle certain problems, though, and that's where I'd appreciate help. If this is the wrong place to ask this kind of questions, please redirect me ;-).
1.) In the current implementation, a NetstringParseError results in a call to self.transport.loseConnection(). Furthermore the variable brokenPeer is set to 1. But this variable is ignored. Would it be better to check it before receiving data? If yes: what is the correct way for a protocol to handle a broken transport? There is a ConnectionLost exception, but I'm not sure if / when a Protocol is allowed to raise it.
No; I think once loseConnection has been called dataReceived won't be called again, so there's no point checking for brokenPeer in dataReceived. It might be worth checking in sendString, but I would expect that it would already raise an appropriate exception because it will try to write to a closed transport.
2.) The current implementation uses the method dataReceived to assemble the content of the netstring. When it is complete, the method stringReceived (which raises a NotImplementedError by default) is called. But netstrings can be long - would it be better to create a Deferred instance and call its callback instead? What would be the right place to create this Deferred?
No. I'm rather confused about why you're even asking this question. Exactly what API are you proposing as an alternative to the existing API (which is subclass NetstringReceiver and override stringReceived, and stringReceived will be called for every netstring that is received, as soon as it is received)?
3.) I started testing the NetstringReceiver by creating a class comment with doctests. When I later grepped the sources, I found some doctests, but there no mention of the doctest module in the test policy document. Would it be better to use the unittest framework for testing the code?
Yes please. There are some existing tests for NetstringReceiver in twisted/test/test_protocols.py. -Andrew.

On Thu, Mar 25, 2010 at 09:22:03AM +1100, Andrew Bennetts wrote:
I'm rather confused about why you're even asking this question. Exactly what API are you proposing as an alternative to the existing API (which is subclass NetstringReceiver and override stringReceived, and stringReceived will be called for every netstring that is received, as soon as it is received)?
Never mind - I didn't understand the concept of deferreds ;-). When I read the code for FileSender, I thought it might be cool to have something similar for the NetstringReceiver. But in the case of FileSender, the delivery of a file is explicitly triggered by the client. My vague idea was to create a method beginNetstringReception that returns a Deferred. But since Deferred instances are not reusable, I'd have to call this method each time I wanted to receive a netstring, and this is obviously nonsense. Regards, -- DI Albert Brandl Weiermayer Solutions GmbH 4813 Altmünster | Abteistraße 12 | Austria Fon: +43(0)720 70 30 14 | Fax: +43 (0) 7612 88081

On 08:36 am, albert.brandl@weiermayer.com wrote:
On Thu, Mar 25, 2010 at 09:22:03AM +1100, Andrew Bennetts wrote:
I'm rather confused about why you're even asking this question. Exactly what API are you proposing as an alternative to the existing API (which is subclass NetstringReceiver and override stringReceived, and stringReceived will be called for every netstring that is received, as soon as it is received)?
Never mind - I didn't understand the concept of deferreds ;-). When I read the code for FileSender, I thought it might be cool to have something similar for the NetstringReceiver. But in the case of FileSender, the delivery of a file is explicitly triggered by the client.
My vague idea was to create a method beginNetstringReception that returns a Deferred. But since Deferred instances are not reusable, I'd have to call this method each time I wanted to receive a netstring, and this is obviously nonsense.
You might want a producer/consumer-based API. Jean-Paul

On Thu, Mar 25, 2010 at 09:22:03AM +1100, Andrew Bennetts wrote:
No; I think once loseConnection has been called dataReceived won't be called again, so there's no point checking for brokenPeer in dataReceived.
There is a unittest that assumes that dataReceives still works after sending in garbage: def test_illegal(self): """ Assert that illegal strings cause the transport to be closed. """ for s in self.illegalStrings: r = self.getProtocol() for c in s: r.dataReceived(c) self.assertTrue(r.transport.disconnecting) self.illegalStrings is defined as illegalStrings = [ '9999999999999999999999', 'abc', '4:abcde', '51:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab,',] In this implementation, the dataReceived method has to be functional even after the NetstringReceiver has detected that the received data are garbage. And there is no direct information that the received string was no valid netstring - you have to check for r.transport.disconnecting. That's why I thought it might be a good idea to raise an appropriate exception if the receiver is no longer functional. I use a generator- based approach to parsing the netstring (for details see http://twistedmatrix.com/trac/attachment/ticket/4378/ticket_4378.patch), and the generator raises a StopIteration exception. But I'd prefer to raise an exception that explains what happened at a higher level - that's why I asked if there was some predefined exception that would match this kind of problem.
There are some existing tests for NetstringReceiver in twisted/test/test_protocols.py.
Thanks, this was very helpful. Regards, -- DI Albert Brandl Weiermayer Solutions GmbH 4813 Altmünster | Abteistraße 12 | Austria Fon: +43(0)720 70 30 14 | Fax: +43 (0) 7612 88081

On 10:25 am, albert.brandl@weiermayer.com wrote:
On Thu, Mar 25, 2010 at 09:22:03AM +1100, Andrew Bennetts wrote:
No; I think once loseConnection has been called dataReceived won't be called again, so there's no point checking for brokenPeer in dataReceived.
There is a unittest that assumes that dataReceives still works after sending in garbage:
def test_illegal(self): """ Assert that illegal strings cause the transport to be closed. """ for s in self.illegalStrings: r = self.getProtocol() for c in s: r.dataReceived(c) self.assertTrue(r.transport.disconnecting)
self.illegalStrings is defined as
illegalStrings = [ '9999999999999999999999', 'abc', '4:abcde', '51:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab,',]
In this implementation, the dataReceived method has to be functional even after the NetstringReceiver has detected that the received data are garbage.
I'm not sure about this. Each iteration around this loop creates a new instance of NetstringReceiver; after the illegal string is delivered, the instance isn't used again. So, are you sure this is testing what you think it's testing? Jean-Paul

exarkun@twistedmatrix.com wrote:
On 10:25 am, albert.brandl@weiermayer.com wrote: [...]
There is a unittest that assumes that dataReceives still works after sending in garbage:
def test_illegal(self): """ Assert that illegal strings cause the transport to be closed. """ for s in self.illegalStrings: r = self.getProtocol() for c in s: r.dataReceived(c) self.assertTrue(r.transport.disconnecting)
self.illegalStrings is defined as
illegalStrings = [ '9999999999999999999999', 'abc', '4:abcde', '51:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab,',]
In this implementation, the dataReceived method has to be functional even after the NetstringReceiver has detected that the received data are garbage.
I'm not sure about this. Each iteration around this loop creates a new instance of NetstringReceiver; after the illegal string is delivered, the instance isn't used again.
That's true — but each illegal string is delivered one byte at a time. So the 'abc' string will trigger a NetstringParseError on the first dataReceived('a'), but then the test expects the subsequent dataReceived('b') and dataReceived('c') to execute without raising. -Andrew.

On Fri, Mar 26, 2010 at 12:08:02PM -0000, exarkun@twistedmatrix.com wrote:
So, are you sure this is testing what you think it's testing?
Yes. This test was already there, and it failed in unexpected ways, so I had to look at it quite closely. I'm not too happy about it, since it tests several things at once (illegal strings, bytewise delivery), but I decided to leave it there for now, just to make sure that I don't miss some test case with the new unit tests. As Andrew pointed out, this unittest assumes that dataReceived is still (somewhat) functional after the detection of an invalid netstring. This does not make much sense IMO, and it complicates the code. I'd prefer to raise an exception and adapt the unittest accordingly. But I still don't know which exception would be best. There is a ConnectionLost exception, which is raised in twisted.protocols.amp.BinaryBoxProtocol when the transport is None. Other options are ConnectionClosed and ConnectionDone (both are not raised directly anywhere), or a new one. Maybe ConnectionLost is the most appropriate one. Best regards, Albert -- DI Albert Brandl Weiermayer Solutions GmbH | http://www.weiermayer.com 4813 Altmünster | Abteistraße 12 | Austria Fon: +43(0)720 70 30 14 | Fax: +43 (0) 7612 88081

On 11:11 am, albert.brandl@weiermayer.com wrote:
On Fri, Mar 26, 2010 at 12:08:02PM -0000, exarkun@twistedmatrix.com wrote:
So, are you sure this is testing what you think it's testing?
Yes. This test was already there, and it failed in unexpected ways, so I had to look at it quite closely. I'm not too happy about it, since it tests several things at once (illegal strings, bytewise delivery), but I decided to leave it there for now, just to make sure that I don't miss some test case with the new unit tests.
As Andrew pointed out, this unittest assumes that dataReceived is still (somewhat) functional after the detection of an invalid netstring. This does not make much sense IMO, and it complicates the code. I'd prefer to raise an exception and adapt the unittest accordingly.
I see. A good first step (separate from fixing any of the implementation of NetstringReceiver) might be to clean up the tests, then, so that each only exercises one case, and so that only the desired behavior is tested (for example, tests for the behavior you're talking about, of receiving invalid bytes after the connection has been closed, could be removed).
But I still don't know which exception would be best. There is a ConnectionLost exception, which is raised in twisted.protocols.amp.BinaryBoxProtocol when the transport is None. Other options are ConnectionClosed and ConnectionDone (both are not raised directly anywhere), or a new one. Maybe ConnectionLost is the most appropriate one.
Feel free to introduce new exception types. Unfortunately I don't remember what change you're actually proposing to make to NetstringReceiver anymore, so I won't try to suggest any names. Jean-Paul
participants (3)
-
Albert Brandl
-
Andrew Bennetts
-
exarkun@twistedmatrix.com