[Python-Dev] http.client Nagle/delayed-ack optimization

Kristján Valur Jónsson kristjan at ccpgames.com
Thu Dec 20 15:08:43 CET 2012


How serendipitous, I was just reporting a similar problem to Sony in one of their console SDKs yesterday :)
Indeed, the Nagle problem only shows up if you are sending more than one segment that is not full size.
It will not occur in a sequence of full segments.  Therefore, it is perfectly ok to send the headers + payload as a set of large chunks.
The problem only occurs when sending two or more short segments in a row.  So even if you send the short headers followed by the large payload, there is no problem.
The problem arises only if, in addition to the short headers, the payload itself is short.

In summary:  If the payload is less than the MSS (call it perhaps 2k), send it along with the headers.  Otherwise, you can go ahead and send the headers and the payload (in large chunks if you want) without fear.
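Here is a rough sketch of that rule in Python -- untested, the function name is made up, and ASSUMED_MSS is just a guess (the real segment size depends on the path):

import socket

ASSUMED_MSS = 2048  # guessed maximum segment size; the real value is path-dependent

def send_request(sock: socket.socket, headers: bytes, payload: bytes) -> None:
    if len(payload) <= ASSUMED_MSS:
        # Small payload: concatenate so headers + body leave as one segment,
        # never producing two consecutive short segments (the Nagle/delayed-ack case).
        sock.sendall(headers + payload)
    else:
        # Large payload: the headers may go out as one short segment, but the
        # payload that follows keeps segments full, so no stall occurs.
        sock.sendall(headers)
        sock.sendall(payload)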

See:
http://en.wikipedia.org/wiki/Nagle%27s_algorithm
and
http://en.wikipedia.org/wiki/TCP_delayed_acknowledgment

K

> -----Original Message-----
> From: Python-Dev [mailto:python-dev-
> bounces+kristjan=ccpgames.com at python.org] On Behalf Of Antoine Pitrou
> Sent: 14 December 2012 19:27
> To: python-dev at python.org
> Subject: Re: [Python-Dev] http.client Nagle/delayed-ack optimization
> 
> On Sat, 15 Dec 2012 06:17:19 +1100
> Ben Leslie <benno at benno.id.au> wrote:
> > The http.client HTTPConnection._send_output method has an optimization
> > for avoiding bad interactions between delayed-ack and the Nagle
> algorithm:
> >
> > http://hg.python.org/cpython/file/f32f67d26035/Lib/http/client.py#l884
> >
> Unfortunately this interacts rather poorly in the case where the
> message_body is a bytes instance and is rather large.
> >
> > If the message_body is bytes it is appended to the headers, which
> > causes a copy of the data. When message_body is large this duplication
> > of data can cause a significant spike in memory usage.
> >
> (In my particular case I was uploading a 200MB file to 30 hosts at the
> same time, leading to memory spikes of over 6GB.)
> >
> > I've solved this by subclassing and removing the optimization, however
> > I'd appreciate thoughts on how this could best be solved in the library itself.
> >
> > Options I have thought of are:
> >
> > 1: Have some size threshold on the copy. A little bit too much magic.
> > Unclear what the size threshold should be.
> 
> I think a hardcoded threshold is the right thing to do. It doesn't sound very
> useful to try doing a single send() call when you have a large chunk of data
> (say, more than 1 MB).
> 
> Regards
> 
> Antoine.
> 
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-
> dev/kristjan%40ccpgames.com



