[Python-Dev] Darwin's realloc(...) implementation never shrinks allocations
Bob Ippolito
bob at redivi.com
Mon Jan 3 21:55:19 CET 2005
On Jan 3, 2005, at 3:23 PM, bacchusrx wrote:
> On Thu, Jan 01, 1970 at 12:00:00AM +0000, Tim Peters wrote:
>> Is there any known case where Python performs poorly on this OS, for
>> this reason, other than the "pass giant numbers to recv() and then
>> shrink the string because we didn't get anywhere near that many bytes"
>> case?
>>
>> [...]
>>
>> I agree the socket-abuse case should be fiddled, and for more reasons
>> than just Darwin's realloc() quirks. [...] Yes, in the socket-abuse
>> case, where the program routinely malloc()s strings millions of bytes
>> larger than the socket can deliver, it would obviously help. That's
>> not typically program behavior (however typical it may be of that
>> specific app).
>
> Note that, with respect to http://python.org/sf/1092502, the author of
> the (original) program was using the documented interface to a file
> object. It's _fileobject.read() that decides to ask for huge numbers of
> bytes from recv() (specifically, in the max(self._rbufsize, left)
> condition). Patched to use a fixed recv_size, you of course sidestep the
> realloc() nastiness in this particular case.
While using a reasonably sized recv_size is a good idea, a smaller
request size only makes it less likely that the strings will be resized
by a significant amount. It is still highly likely that they *will* be
resized, and it doesn't solve the problem that over-allocated strings
persist until the entire request is fulfilled.
For example, receiving 1-byte chunks (if that's even possible) would
exacerbate the issue even for a small request size. If you asked for 8
MB with a request size of 1024 bytes and received it in 1-byte chunks,
you would need an impossible ~16 GB at minimum to satisfy that request
(~8 GB to collect the strings, plus ~8 GB to concatenate them), as
opposed to the Python-optimal case of ~16 MB when always using compact
representations.
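The ~8 GB collection figure can be checked with a few lines of
arithmetic. This is only a sketch: the variable names are mine, and it
counts just the string buffers, assuming each 1-byte result keeps its
full 1024-byte allocation and ignoring per-object overhead.

```python
MB = 1024 * 1024
GB = 1024 * MB

request_total = 8 * MB   # bytes asked of read()
recv_size = 1024         # bytes allocated for each recv() call
chunk_size = 1           # bytes actually delivered by each recv() call

# Number of strings collected before the request is fulfilled.
chunks = request_total // chunk_size

# If realloc() never shrinks, every 1-byte string pins a 1024-byte buffer.
held_if_never_shrunk = chunks * recv_size

# If each string were shrunk to fit, only the payload would be held.
held_if_compact = chunks * chunk_size

print(held_if_never_shrunk // GB, "GB held in over-allocated strings")
print(held_if_compact // MB, "MB held in compact strings")
```

Doubling the compact figure to account for the joined copy gives the
~16 MB optimal case.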
Using cStringIO instead of a list of potentially over-allocated strings
would actually have such Python-optimal memory usage characteristics on
all platforms.
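A minimal sketch of that single-buffer approach, written against
io.BytesIO (the Python 3 counterpart of cStringIO). The function name
read_into_buffer and the recv callable are hypothetical, chosen for
illustration; this is not the socket module's actual code.

```python
import io


def read_into_buffer(recv, total, recv_size=1024):
    """Collect up to `total` bytes from `recv` into one growing buffer.

    `recv` is any callable taking a maximum byte count and returning
    bytes, for example a socket's recv method (an assumption here).
    """
    buf = io.BytesIO()
    left = total
    while left > 0:
        data = recv(min(recv_size, left))
        if not data:
            # An empty result means the peer closed the connection.
            break
        # The bytes are copied into the buffer, so `data` (and whatever
        # it over-allocated) can be released after this iteration.
        buf.write(data)
        left -= len(data)
    return buf.getvalue()
```

Because each received string is dropped as soon as it is copied, at
most one over-allocated string is alive at any time, however small the
chunks turn out to be.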
-bob