[Twisted-Python] Request.getClient

Hello all, twisted.web.http.Request.getClient has a terrible implementation. It does blocking network I/O (DNS). Fortunately it is only used in one place in Twisted - the CGI implementation. Unfortunately this makes the CGI implementation somewhat unsuited for real-world use. `Request.getClient` has always been allowed to return `None` under certain circumstances. I propose making it always return `None` and deprecating it. This is implemented in the branch linked to <https://tm.tl/2252>. Chris Armstrong suggested that this change might not be strictly keeping with our backwards compatibility policy. I suggest that either it is - because `None` was always a possible return value - or that removing the possibility of blocking I/O from applications that are mistakenly using this API makes it worth the not- strictly-compatible change. A minor adjustment might be to make it always return the IP address instead, as this was another behavior it previously had. Please comment. Thanks. Jean-Paul

Actually, the apidocs say "Undocumented", so I think you could make it return u"The 🌕 is made of green 酪 " and still be in spec. http://twistedmatrix.com/documents/14.0.0/api/twisted.web.http.Request.html#... Dustin On Wed, Jul 2, 2014 at 12:26 PM, <exarkun@twistedmatrix.com> wrote:

On 06:26 pm, dustin@v.igoro.us wrote:
The interface is documented: https://twistedmatrix.com/documents/14.0.0/api/twisted.web.iweb.IRequest.htm... (as deprecated, already). Also, we generally give some weight to how an implementation actually behaves - even if the documentation is lacking or contradictory. If you ignore incorrect docs and figure out how to use an API so it works, you could conceivably have working software. Twisted tries not to break such code. If you ignore how the implementation actually behaves and blindly trust the documentation (sorry, wish you could do that) then your software probably doesn't work. Since it's probably already broken, Twisted isn't as concerned making changes that might break it more or differently. Jean-Paul

On Jul 2, 2014, at 9:26 AM, exarkun@twistedmatrix.com wrote:
I agree that this is a troubling area but in general I tend to believe that changes like this are in keeping with our compatibility policy. Changing the signature or the allowed return type of a value ("type" speaking in terms of public features of its interface) should not be allowed. If we already returned None sometimes, then a correct program would already have to deal with None sometimes, so making this change wouldn't break it, per se. Bringing up such a change on the list is always a good policy, since it gives people a chance to audit their code and look for places where they might have been depending too intimately on accidental features, so by no means take my belief that this is in-policy to mean that we shouldn't broadly discuss changes like this in the future :-). Real, actual, broken code is what the policy strives to prevent, so real world code that broke should usually take precedence.
I suggest that either it is - because `None` was always a possible return value - or that removing the possibility of blocking I/O from applications that are mistakenly using this API makes it worth the not- strictly-compatible change.
A minor adjustment might be to make it always return the IP address instead, as this was another behavior it previously had.
I think that this adjustment is the best option. IP addresses are mostly interchangeable with hostnames, so during the transition period while it's being deprecated, even an application relying on this API heavily would at least have an opportunity to keep functionality equivalent during an upgrade. Making it always return None means that a correct application (one which dealt with the None return value), while not becoming crash-with-an-exception buggy, might lose functionality (logging a source IP of "None" all the time, for example, and losing track of an audit log of who is making what changes). Unbidden, I have some ideas about how we might preserve even _more_ of the functionality involving DNS lookups, but more effort than just giving back the IP is probably wasted, so I won't mention them. Let's deprecate the API and move on. Thanks for bringing this up, -glyph

On July 2, 2014 at 4:04:59 PM, Glyph Lefkowitz (glyph@twistedmatrix.com) wrote: On Jul 2, 2014, at 9:26 AM, exarkun@twistedmatrix.com wrote: Hello all, twisted.web.http.Request.getClient has a terrible implementation. It does blocking network I/O (DNS). Fortunately it is only used in one place in Twisted - the CGI implementation. Unfortunately this makes the CGI implementation somewhat unsuited for real-world use. `Request.getClient` has always been allowed to return `None` under certain circumstances. I propose making it always return `None` and deprecating it. This is implemented in the branch linked to <https://tm.tl/2252>. Chris Armstrong suggested that this change might not be strictly keeping with our backwards compatibility policy. I agree that this is a troubling area but in general I tend to believe that changes like this are in keeping with our compatibility policy. Changing the signature or the allowed return type of a value ("type" speaking in terms of public features of its interface) should not be allowed. If we already returned None sometimes, then a correct program would already have to deal with None sometimes, so making this change wouldn't break it, per se. Bringing up such a change on the list is always a good policy, since it gives people a chance to audit their code and look for places where they might have been depending too intimately on accidental features, so by no means take my belief that this is in-policy to mean that we shouldn't broadly discuss changes like this in the future :-). Real, actual, broken code is what the policy strives to prevent, so real world code that broke should usually take precedence. I suggest that either it is - because `None` was always a possible return value - or that removing the possibility of blocking I/O from applications that are mistakenly using this API makes it worth the not- strictly-compatible change. A minor adjustment might be to make it always return the IP address instead, as this was another behavior it previously had. I think that this adjustment is the best option. IP addresses are mostly interchangeable with hostnames, so during the transition period while it's being deprecated, even an application relying on this API heavily would at least have an opportunity to keep functionality equivalent during an upgrade. Making it always return None means that a correct application (one which dealt with the None return value), while not becoming crash-with-an-exception buggy, might lose functionality (logging a source IP of "None" all the time, for example, and losing track of an audit log of who is making what changes). Unbidden, I have some ideas about how we might preserve even _more_ of the functionality involving DNS lookups, but more effort than just giving back the IP is probably wasted, so I won't mention them. Let's deprecate the API and move on. Thanks for bringing this up, I, too, like the idea of returning the IP address. -- Christopher Armstrong http://twitter.com/radix http://wordeology.com/

Actually, the apidocs say "Undocumented", so I think you could make it return u"The 🌕 is made of green 酪 " and still be in spec. http://twistedmatrix.com/documents/14.0.0/api/twisted.web.http.Request.html#... Dustin On Wed, Jul 2, 2014 at 12:26 PM, <exarkun@twistedmatrix.com> wrote:

On 06:26 pm, dustin@v.igoro.us wrote:
The interface is documented: https://twistedmatrix.com/documents/14.0.0/api/twisted.web.iweb.IRequest.htm... (as deprecated, already). Also, we generally give some weight to how an implementation actually behaves - even if the documentation is lacking or contradictory. If you ignore incorrect docs and figure out how to use an API so it works, you could conceivably have working software. Twisted tries not to break such code. If you ignore how the implementation actually behaves and blindly trust the documentation (sorry, wish you could do that) then your software probably doesn't work. Since it's probably already broken, Twisted isn't as concerned making changes that might break it more or differently. Jean-Paul

On Jul 2, 2014, at 9:26 AM, exarkun@twistedmatrix.com wrote:
I agree that this is a troubling area but in general I tend to believe that changes like this are in keeping with our compatibility policy. Changing the signature or the allowed return type of a value ("type" speaking in terms of public features of its interface) should not be allowed. If we already returned None sometimes, then a correct program would already have to deal with None sometimes, so making this change wouldn't break it, per se. Bringing up such a change on the list is always a good policy, since it gives people a chance to audit their code and look for places where they might have been depending too intimately on accidental features, so by no means take my belief that this is in-policy to mean that we shouldn't broadly discuss changes like this in the future :-). Real, actual, broken code is what the policy strives to prevent, so real world code that broke should usually take precedence.
I suggest that either it is - because `None` was always a possible return value - or that removing the possibility of blocking I/O from applications that are mistakenly using this API makes it worth the not- strictly-compatible change.
A minor adjustment might be to make it always return the IP address instead, as this was another behavior it previously had.
I think that this adjustment is the best option. IP addresses are mostly interchangeable with hostnames, so during the transition period while it's being deprecated, even an application relying on this API heavily would at least have an opportunity to keep functionality equivalent during an upgrade. Making it always return None means that a correct application (one which dealt with the None return value), while not becoming crash-with-an-exception buggy, might lose functionality (logging a source IP of "None" all the time, for example, and losing track of an audit log of who is making what changes). Unbidden, I have some ideas about how we might preserve even _more_ of the functionality involving DNS lookups, but more effort than just giving back the IP is probably wasted, so I won't mention them. Let's deprecate the API and move on. Thanks for bringing this up, -glyph

On July 2, 2014 at 4:04:59 PM, Glyph Lefkowitz (glyph@twistedmatrix.com) wrote: On Jul 2, 2014, at 9:26 AM, exarkun@twistedmatrix.com wrote: Hello all, twisted.web.http.Request.getClient has a terrible implementation. It does blocking network I/O (DNS). Fortunately it is only used in one place in Twisted - the CGI implementation. Unfortunately this makes the CGI implementation somewhat unsuited for real-world use. `Request.getClient` has always been allowed to return `None` under certain circumstances. I propose making it always return `None` and deprecating it. This is implemented in the branch linked to <https://tm.tl/2252>. Chris Armstrong suggested that this change might not be strictly keeping with our backwards compatibility policy. I agree that this is a troubling area but in general I tend to believe that changes like this are in keeping with our compatibility policy. Changing the signature or the allowed return type of a value ("type" speaking in terms of public features of its interface) should not be allowed. If we already returned None sometimes, then a correct program would already have to deal with None sometimes, so making this change wouldn't break it, per se. Bringing up such a change on the list is always a good policy, since it gives people a chance to audit their code and look for places where they might have been depending too intimately on accidental features, so by no means take my belief that this is in-policy to mean that we shouldn't broadly discuss changes like this in the future :-). Real, actual, broken code is what the policy strives to prevent, so real world code that broke should usually take precedence. I suggest that either it is - because `None` was always a possible return value - or that removing the possibility of blocking I/O from applications that are mistakenly using this API makes it worth the not- strictly-compatible change. A minor adjustment might be to make it always return the IP address instead, as this was another behavior it previously had. I think that this adjustment is the best option. IP addresses are mostly interchangeable with hostnames, so during the transition period while it's being deprecated, even an application relying on this API heavily would at least have an opportunity to keep functionality equivalent during an upgrade. Making it always return None means that a correct application (one which dealt with the None return value), while not becoming crash-with-an-exception buggy, might lose functionality (logging a source IP of "None" all the time, for example, and losing track of an audit log of who is making what changes). Unbidden, I have some ideas about how we might preserve even _more_ of the functionality involving DNS lookups, but more effort than just giving back the IP is probably wasted, so I won't mention them. Let's deprecate the API and move on. Thanks for bringing this up, I, too, like the idea of returning the IP address. -- Christopher Armstrong http://twitter.com/radix http://wordeology.com/
participants (4)
-
Christopher Armstrong
-
Dustin J. Mitchell
-
exarkun@twistedmatrix.com
-
Glyph Lefkowitz