
Hi all, I'm new to Python development. I'm interested in the new TCP Fast Open protocol (http://research.google.com/pubs/pub37517.html). This protocol is implemented in Linux kernel 3.6 for the client and 3.7 for the server, and the related constants are defined in Python changeset 5435a9278028. This TCP change is an important optimization, in particular for HTTP, and it is completely backward compatible: even if a client or a server doesn't support TFO, the connection proceeds with the normal procedure. I think an implementation in the socketserver module would be useful: an attribute "allow_tcp_fast_open" that automatically sets the correct socket option before listening (another attribute is necessary to choose the queue size). A similar implementation can be done in the http modules. The default value of this attribute could be "True" (given its backward compatibility), but new versions of glibc might expose the TCP_FASTOPEN constant even if the kernel does not support it (so using hasattr to check whether the constant exists doesn't guarantee that TFO is supported by the kernel). Maybe more complex code can resolve this problem, but I don't know how to do that (maybe catching an exception or checking the kernel version?). I attached the simple patch for socketserver (and doc), let me know what you think! Federico
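The "catching an exception" idea from the message above could look roughly like the sketch below. It is not the attached patch. The attribute name allow_tcp_fast_open comes from the proposal; tcp_fast_open_queue_size, tfo_enabled, enable_tfo, TFOMixin and TFOTCPServer are made-up names for illustration, and 23 is assumed to be the Linux value of TCP_FASTOPEN, used only when this Python build does not expose the constant.

```python
import socket
import socketserver

# Assumption: 23 is the Linux value of TCP_FASTOPEN; used only as a fallback
# when the socket module was built without the constant.
TCP_FASTOPEN = getattr(socket, "TCP_FASTOPEN", 23)


def enable_tfo(sock, queue_size):
    """Try to set TCP_FASTOPEN on a socket that is not listening yet.

    Returns True if setsockopt() succeeded, False if the system refused it.
    """
    try:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_FASTOPEN, queue_size)
    except OSError:
        return False
    return True


class TFOMixin:
    # Opt-in in this sketch; the proposal suggests True might be the default.
    allow_tcp_fast_open = False
    # Hypothetical name for the "queue size" attribute the proposal mentions.
    tcp_fast_open_queue_size = 5
    # Hypothetical flag recording whether the option was accepted.
    tfo_enabled = False

    def server_activate(self):
        # Set the option after bind() and before listen().
        if self.allow_tcp_fast_open:
            self.tfo_enabled = enable_tfo(self.socket, self.tcp_fast_open_queue_size)
        super().server_activate()


class TFOTCPServer(TFOMixin, socketserver.TCPServer):
    pass
```

A successful setsockopt() only shows that the option was accepted; it does not by itself prove that any connection will actually use Fast Open.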

Hi! On Thu, Jan 10, 2013 at 09:44:54PM +0100, Federico Reghenzani <federico.dev@reghe.net> wrote:
I attached the simple patch for socketserver (and doc), let me know what you think!
The patch looks good at first glance, thank you for the work! A better place for patches is the issue tracker at http://bugs.python.org -- patches in the mailing list tend to be lost. Oleg. -- Oleg Broytman http://phdru.name/ phd@phdru.name Programmers don't die, they just GOSUB without RETURN.

Hi Oleg, I've posted here because I'm asking whether it would be a good idea to also make some changes in the http module, maybe setting that option to 'True' by default (but first we need to fix the kernel-glibc problem). Thanks, Federico

On Thu, Jan 10, 2013 at 10:06:21PM +0100, Federico Reghenzani <federico.dev@reghe.net> wrote:
I think IWBN to patch as many network modules as possible (ftplib, urllib, urllib2, xmlrpclib). Having tests also helps. Oleg. -- Oleg Broytman http://phdru.name/ phd@phdru.name Programmers don't die, they just GOSUB without RETURN.

On Jan 10, 2013, at 10:24 PM, Guido van Rossum <guido@python.org> wrote:
Is there sample code for an HTTP client? What if the server doesn't yet support the feature?
As I read it, this is transparent for the application if it doesn't support it. https://lwn.net/Articles/508865/ - benoît

On 1/10/2013 4:29 PM, Benoit Chesneau wrote:
I read both the post (Aug 1, 2012, before the Linux 3.7 with the server code) and comments. FastOpen appears to still be an experimental proposal: "Currently, TFO is an Internet Draft with the IETF. ... (The current implementation employs the TCP Experimental Option Number facility as a placeholder for a real TCP Option Number.)". From the comments, I would say that its success outside of Google is not certain. It appears that its main use case is repeated requests to webservers from browsers. This is because the latter often make *multiple* requests, often short, to the same site in order to construct a displayed web page. There is no time saving on the first request of a series. I suspect that after Google updates Chrome to use the new feature, one of the other 'independent' browsers is likely to be the next user. To be active, the feature must be compiled into the socket code of both server and client machines AND must be explicitly requested by both client and server applications. On the server side, it must be requested because the request makes a promise that syn+data requests will be handled idempotently. (So the default should be 'off'.) This is trivial for static web pages but may require app-specific overhead for anything else. So, in general, the app should not bother being able to handle FastOpen unless it will be run on servers with FastOpen, and for efficiency, it should not add the overhead unless it is needed because a particular request is from a FastOpen client. This is not a problem for Google, with thousands of duplicate apps running on duplicate server configurations. But it was not clear in the OP's post how a Python app would know for sure whether a particular machine is FastOpen capable. I did not see the question of how a server app would know about the client connection type even addressed.
On the client side, .connect and at least the first .send must be combined into either .sendto or .sendmsg (which?, still to be decided, apparently;-) with a new MSG_FASTOPEN argument. So programs need a non-trivial rewrite. If a particular server is not fastopen capable, then new fastopen client kernel socket code can potentially handle the fallback to the old way. But if the client is not fastopen capable, the fallback must be handled in the Python .sendto code or else in the client code. (So one of those layers must *know* the client system capability.) Again, dealing with this, on multiple OSes, should be a lot easier for a monolithic browser like Chrome or Firefox (which might, on some systems, even use their own socket layer code), than for general purpose Python socket and app code. So my conclusion is that this is (mostly) premature for Python at this time. This is a slight performance enhancement of limited use that will make code at least slightly more complex in a core module that must be kept at least as rock solid as it is now. Let Google get it working on both their servers and Chrome browser. And wait for Mozilla, say, to add it to Firefox. Things might change before the first 3.4 beta, but I think 3.5 is more likely. Of course, testing will require all 4 combinations of client and server. -- Terry Jan Reedy
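On the question raised above of how a Python program could tell whether a particular machine is FastOpen capable: on Linux, one possible approach is to read the net.ipv4.tcp_fastopen sysctl. The sketch below assumes the bitmask documented for Linux (0x1 enables the client side, 0x2 the server side) and says nothing about other operating systems. The function names are made up for illustration.

```python
def linux_tfo_flags(path="/proc/sys/net/ipv4/tcp_fastopen"):
    """Return the kernel's tcp_fastopen bitmask, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # No such file (non-Linux or old kernel) or unexpected contents.
        return None


def tfo_support(flags):
    """Decode the bitmask; assumes bit 0x1 = client side, bit 0x2 = server side."""
    if flags is None:
        return {"client": False, "server": False}
    return {"client": bool(flags & 0x1), "server": bool(flags & 0x2)}
```

For example, tfo_support(linux_tfo_flags()) gives a conservative answer: anything that cannot be read is reported as unsupported.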

On Fri, Jan 11, 2013 at 3:45 AM, Terry Reedy <tjreedy@udel.edu> wrote:
Yes, the protocol has been designed for situations where there are multiple requests, such as HTTP or FTP. Probably only in these cases is a default of 'True' appropriate.
If the server doesn't support FastOpen and receives a FastOpen request from a capable client, it simply ignores the TFO cookie and replies with a normal SYN+ACK. In this case the first packet (SYN+TFO from the client) is only 4 bytes larger than in a normal connection; no other packet is bigger than normal. So for a server app that does not support FastOpen, it is completely transparent and does not cause any overhead.
The server knows the client connection type from the first packet the client sends: if the first packet coming from the client is a SYN with a TFO cookie, the server proceeds to generate a cookie and continues with a FastOpen connection; if the first packet is a plain SYN, the server proceeds with the normal 3-way handshake. In any case these operations are transparent both to Python and to the application because they're done by the kernel.
As I said, if a client uses .sendto or .sendmsg with MSG_FASTOPEN against a server that is not TFO capable, the Linux kernel falls back to the old way, so it is as if it had done a normal .connect and .send. The application doesn't know whether the connection has been made in TFO mode or normal mode, and doesn't need to know.
We can introduce TFO only in some modules such as HTTP or FTP. The code is not really complex: for the server it is only a .setsockopt before .listen, and for the client we should replace the .connect and the first .send with a single .sendto or .sendmsg.
On Jan 10, 2013, at 10:46 PM, Guido van Rossum:
Hopefully the OP has some sample Python code?
Yes, it is practically the same as C. I attached examples (I needed to declare the TCP and MSG constants manually because my glibc doesn't have them yet). Federico Reghenzani
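The attached examples are not reproduced in this archive. As a stand-in, the client half described above (one .sendto replacing .connect plus the first .send) might be sketched as follows. The function name connect_and_send is made up, 0x20000000 is assumed to be the Linux value of MSG_FASTOPEN and is used only when the socket module lacks the constant, and falling back on any OSError is a simplification.

```python
import socket

# Assumption: 0x20000000 is the Linux value of MSG_FASTOPEN; fallback only.
MSG_FASTOPEN = getattr(socket, "MSG_FASTOPEN", 0x20000000)


def connect_and_send(address, data):
    """Open a TCP connection and send data, trying Fast Open first.

    Returns the connected socket. If the local system rejects the
    MSG_FASTOPEN call, the classic connect() + send() sequence is used.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # A single call replaces connect() and the first send().
        sent = sock.sendto(data, MSG_FASTOPEN, address)
    except OSError:
        # Simplification: any failure triggers the fallback. A real client
        # would probably inspect errno before retrying.
        sock.close()
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.connect(address)
            sock.sendall(data)
        except OSError:
            sock.close()
            raise
        return sock
    try:
        # sendto() may accept only part of the data; send the remainder.
        sock.sendall(data[sent:])
    except OSError:
        sock.close()
        raise
    return sock
```

The caller gets back an ordinary connected socket either way, which matches the claim that the application does not need to know which mode was used.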

On Jan 11, 2013, at 8:30 AM, Federico Reghenzani <federico.dev@reghe.net> wrote:
For experimentation I added a patch to gunicorn in the `featire/tcp_fast` branch: https://github.com/benoitc/gunicorn/pull/471 I expect to do the same in my restkit (http client lib) so I can test it all together. So far this API can be interesting for internal purposes as well. - benoît

On 1/11/13, Federico Reghenzani <federico.dev@reghe.net> wrote:
On Fri, Jan 11, 2013 at 3:45 AM, Terry Reedy <tjreedy@udel.edu> wrote:
What is the harm of using it in other situations? If the answer were truly just "4 bytes per host", then it might still be a good tradeoff.
This, however, is a problem. Based on (most of) the rest of your descriptions, it sounds like a seamless drop-in replacement; it should be an implementation detail that applications never ever notice, like having a security patch applied to the operating system when python isn't even running. But if that were true, an explicit request would be overly cautious, unless this were truly still so experimental that production servers (and, thus, the python distribution in a default build) should not yet use it. Also note that if it isn't available on Windows (and probably even on Windows XP without additional dependencies), Python can't yet rely on it. Below, you also say that it is not appropriate for servers unless syn+data is idempotent -- but I don't know even what that means without looking it up, let alone whether it is true of my app -- so it sounds like a bug magnet.
So how is this a python issue at all? Because of the explicit request? Because of the need to keep something idempotent? I see no harm in letting open accept and pass through additional optional arguments, or in a generic way to query the kernel for its extensions, but if you need something specific to this particular extension, then please do it as an external package first.
Application programs, or just the plumbing in the httplib? -jJ

On 11 Jan, 2013, at 9:50, Jim Jewett <jimjjewett@gmail.com> wrote:
It must be explicitly requested by the server because the behavior might change; in particular, the lwn.net page about this feature mentions that duplicate SYN messages are not detected, and if I parse that page correctly that might mean that the server gets two or more requests when the connection is unreliable (or slow) and retransmission happens. That is fine for static webpages, but not if the client request has side effects (e.g. the server starts updating a database). BTW. This (linux-only) feature is very new, it would IMHO be useful to use this in real life with a package on PyPI that monkeypatches the stdlib before adding the feature to the stdlib. It is currently not clear if the option will be useful in the long run. Ronald
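To make the duplicate-request concern above concrete: one common way for an application to stay safe is to make the handling of a request idempotent, for instance by remembering which requests have already been applied. The sketch below is generic application code, not part of TFO or of any stdlib API. The class name, the client-supplied request_id and the unbounded in-memory dict are all assumptions for illustration.

```python
class IdempotentHandler:
    """Apply each request at most once, keyed by a client-supplied id."""

    def __init__(self, apply):
        self._apply = apply      # function with side effects, e.g. a DB update
        self._results = {}       # request_id -> result (unbounded in this sketch)

    def handle(self, request_id, payload):
        # A replayed request gets the stored result instead of a second
        # execution of the side effect.
        if request_id in self._results:
            return self._results[request_id]
        result = self._apply(payload)
        self._results[request_id] = result
        return result
```

With a handler like this, a duplicated first request would return the same answer without touching the database a second time. A static page server needs nothing of the sort.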

On 11.01.2013 03:45, Terry Reedy wrote:
Agreed. I also wonder how this relates to HTTP pipelining, a feature to improve the same multiple-requests-to-one-server situation. Pipelining has been implemented for years both on clients and servers, yet it is still turned off by default in e.g. Firefox: http://en.wikipedia.org/wiki/HTTP_pipelining There's also HTTP 2.0 on the horizon, so it may be better to watch which of those technologies actually gets enough use in practice, before adding support to the Python library. That said, it may be useful to have a PyPI package which implements the FastOpen protocol in a separate socket implementation (which can then monkey itself into the stdlib, if the application developer wants this). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 11 2013)
2013-01-22: Python Meeting Duesseldorf ... 11 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

On Fri, Jan 11, 2013 at 6:02 PM, M.-A. Lemburg <mal@egenix.com> wrote:
TCP Fast Open should be supported in client code directly; it's not enough to have socket() supporting it. It's not up to the socket() implementation. Server-side is pretty simple, so to say "Python supports TCP_FASTOPEN" there should be support implemented for each (or most) of the client libraries in the stdlib, such as almost every module in http://docs.python.org/3/library/internet.html Monkey-patching all these modules (or their connect() parts) is not a very clean way, I think. -- Kind regards, Yuriy.

On 11.01.2013 21:03, Yuriy Taraday wrote:
Right, the new methods would have to be used by the application.
Of course not, but it's a viable way to test drive such an implementation before putting the code directly into the stdlib modules. gevent uses the same approach, BTW. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 11 2013)

So, again. Has *anyone* actually written *any* working Python code for this? -- --Guido van Rossum (python.org/~guido)

On Thu, Jan 10, 2013 at 01:24:56PM -0800, Guido van Rossum <guido@python.org> wrote:
Is there sample code for an HTTP client? What if the server doesn't yet support the feature?
AFAIU the feature is implemented at the kernel level and doesn't require any change at the user level, only a socket option. If the server doesn't implement the feature the kernel on the client side transparently (to the client) reverts to normal 3-way TCP handshaking. Oleg. -- Oleg Broytman http://phdru.name/ phd@phdru.name Programmers don't die, they just GOSUB without RETURN.

On Fri, Jan 11, 2013 at 01:32:38AM +0400, Oleg Broytman <phd@phdru.name> wrote:
Sorry, I was completely confused. Yes, clients need different calls: https://lwn.net/Articles/508865/ Oleg. -- Oleg Broytman http://phdru.name/ phd@phdru.name Programmers don't die, they just GOSUB without RETURN.

On Thu, Jan 10, 2013 at 1:34 PM, Oleg Broytman <phd@phdru.name> wrote:
Right, that's what I gleaned from skimming the referenced paper. But that and the lwn article you link only show C code. Let's see some Python! (I would try it, but no machine I have access to supports this yet.) Hopefully the OP has some sample Python code? Otherwise I think it's a little too early to adopt this... -- --Guido van Rossum (python.org/~guido)

participants (9)
- Benoit Chesneau
- Federico Reghenzani
- Guido van Rossum
- Jim Jewett
- M.-A. Lemburg
- Oleg Broytman
- Ronald Oussoren
- Terry Reedy
- Yuriy Taraday