On Sat, 15 Dec 2012 06:17:19 +1100 Ben Leslie firstname.lastname@example.org wrote:
The http.client HTTPConnection._send_output method has an optimization for avoiding bad interactions between delayed-ack and the Nagle algorithm:
Unfortunately this interacts rather poorly in the case where the message_body is a bytes instance and is rather large.
If the message_body is bytes it is appended to the headers, which causes a copy of the data. When message_body is large this duplication of data can cause a significant spike in memory usage.
(In my particular case I was uploading a 200MB file to 30 hosts at the same time, leading to memory spikes over 6GB.)
I've solved this by subclassing and removing the optimization; however, I'd appreciate thoughts on how this could best be solved in the library itself.
Options I have thought of are:
1: Have some size threshold on the copy. A little bit too much magic. Unclear what the size threshold should be.
I think a hardcoded threshold is the right thing to do. It doesn't sound very useful to try doing a single send() call when you have a large chunk of data (say, more than 1 MB).
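To make the idea concrete, here is a rough sketch of what a threshold check could look like. It is not the actual http.client code: the helper send_request, the constant MAX_COALESCE_SIZE and the 1 MB value are all made up for illustration, and `send` stands in for whatever callable writes bytes to the socket.

```python
# Hypothetical threshold; the 1 MB value is only an assumption for illustration.
MAX_COALESCE_SIZE = 1024 * 1024


def send_request(send, header_bytes, message_body, threshold=MAX_COALESCE_SIZE):
    """Send headers and body, coalescing them only when the body is small.

    `send` is any callable accepting a bytes object (for example sock.sendall).
    This helper is hypothetical and not part of http.client.
    """
    if isinstance(message_body, bytes) and len(message_body) <= threshold:
        # Small body: a single send() avoids the delayed-ACK/Nagle
        # interaction, and the copy made by concatenation is cheap.
        send(header_bytes + message_body)
    else:
        # Large (or non-bytes) body: send it separately so the body
        # is never duplicated in memory.
        send(header_bytes)
        if message_body is not None:
            send(message_body)
```

With something like this, a small body still goes out in one send() call, while a 200MB body is passed through untouched after the headers.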