"High water" Memory fragmentation still a thing?

Dan Stromberg drsalists at gmail.com
Sat Oct 4 21:29:51 CEST 2014


On Fri, Oct 3, 2014 at 1:01 PM, Skip Montanaro <skip.montanaro at gmail.com> wrote:
> On Fri, Oct 3, 2014 at 1:36 PM, Croepha <croepha at gmail.com>
> wrote:
>
>> Long running Python jobs that consume a lot of memory while
>> running may not return that memory to the operating system
>> until the process actually terminates, even if everything is
>> garbage collected properly.

> The problem boils down to how the program dynamically allocates
> and frees memory, and how the malloc subsystem interacts with
> the kernel through the brk and sbrk system calls.  (Anywhere I
> mention "brk", you can mentally replace it with "sbrk". They do
> the same thing - ask for memory from or return memory to the
> kernel - using a different set of units, memory addresses or
> bytes.)  In the general case, programmers call malloc (or
> calloc, or realloc) to allocate a chunk of storage from the
> heap.  (I'm ignoring anything special which Python has layered
> on top of malloc.  It can mitigate problems, but I don't think
> it will fundamentally change the way malloc interacts with the
> kernel.)  The malloc subsystem maintains a free list (recently
> freed bits of memory) from which it can allocate memory without
> traipsing off to the kernel.  If it can't return a chunk of
> memory from the free list, it will (in the most simpleminded
> malloc implementation) call brk to grab a new (large) chunk of
> memory.  The system simply moves the end of the program's
> "break", effectively increasing or decreasing the (virtual) size
> of the running program.  That memory is then doled out to the
> user by malloc.  If, and only if, every chunk of memory in the
> last chunk allocated by a call to brk is placed on malloc's free
> list, *and* if the particular malloc implementation on your box
> is smart enough to coalesce adjacent chunks of freed memory back
> into brk-sized memory chunks, can brk be called once again to
> reduce the program's footprint.

Actually, ISTR hearing that glibc's malloc and free will use mmap and
munmap to allocate and release large chunks of memory, to avoid
fragmentation.

Digging around on the 'net a bit, it appears that glibc's malloc does
do this (so it applies on most Linux systems), but only for individual
allocations of 128 KiB or more by default (the M_MMAP_THRESHOLD
setting).

Here's a pair of demonstration programs (one in C, one in CPython
3.4), which when run under strace on a Linux system, appear to show
that mmap and munmap are being used:
http://stromberg.dnsalias.org/~strombrg/malloc-and-sbrk.html
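
For anyone who'd rather not run strace, here's a rough sketch of the
same idea (not the programs at the link above). It assumes Linux with
/proc mounted; the names rss_kib and measure are made up for this
example. It watches the process's resident set size while one large
bytes object is created and then deleted. With glibc, an allocation
this far above the 128 KiB threshold should come from mmap, so the
"after" figure would normally fall back close to "before", but that
part depends on the allocator, so treat the output as indicative only.

```python
SIZE = 32 * 1024 * 1024  # far above glibc's default 128 KiB mmap threshold


def rss_kib():
    """Return this process's resident set size in KiB, or None if unknown.

    Assumes a Linux-style /proc/self/status; on other systems the file
    is missing and None is returned.
    """
    try:
        with open("/proc/self/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None


def measure(size=SIZE):
    """Return (before, during, after) RSS readings around one big allocation."""
    before = rss_kib()
    block = b"x" * size  # the fill writes to every page, so they become resident
    during = rss_kib()
    del block            # last reference dropped; CPython frees the buffer at once
    after = rss_kib()
    return before, during, after


if __name__ == "__main__":
    for label, value in zip(("before", "during", "after"), measure()):
        print(label, value, "KiB")
```

Many small objects behave differently: they come out of the brk heap
(or CPython's own arenas), which is where the "high water" effect
described in the quoted text can still show up.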

HTH


