
On Oct 19, 2004, at 8:32, Luis P Caamano wrote:
> This is such a big problem for us that we had to rewrite some of our daemons to fork request handlers so that the memory would be freed. That's the only way we've found to deal with it, and it seems that's the preferred Python way of doing things: using processes, IPC, fork, etc. instead of threads.
Phew! I'm glad I'm not the only one who is frustrated by this. I am willing to put the time in to fix this issue, as long as the Python community doesn't think this is a dumb idea.
> In order to be able to release memory, the interpreter has to allocate memory in chunks bigger than the minimum that can be returned to the OS, e.g., in Linux that'd be 256 bytes (iirc), so that libc's malloc would use mmap to allocate that chunk. Otherwise, if the memory was obtained with brk, then in virtually all OSes and malloc implementations, it won't be returned to the OS even if the interpreter frees the memory.
Absolutely correct. Luckily, this is not a problem for Python, because its current behaviour is this: a) if the allocation is larger than 256 bytes, call the system malloc; b) if the allocation is 256 bytes or less, use Python's own allocator (pymalloc), which allocates memory in 256 kB chunks and never releases them. I want to change the behaviour of (b). If I call free() on these 256 kB chunks, they *do* in fact get released to the operating system.
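To make (b) concrete, here is a rough way to watch the current behaviour from a Python program on Linux. This is only a sketch: it reads VmSize out of /proc, the object counts are arbitrary, and the exact numbers (and even whether the final size drops) depend on the interpreter version and platform.

    # Rough, Linux-only illustration of behaviour (b): millions of small
    # objects are carved out of pymalloc's 256 kB chunks, and since those
    # chunks are never free()d, the process's virtual size stays high even
    # after every one of the objects has been deallocated.

    def vm_size_kb():
        # Current virtual size of this process in kB, from /proc (Linux only).
        with open('/proc/self/status') as f:
            for line in f:
                if line.startswith('VmSize:'):
                    return int(line.split()[1])

    print('at start:         %d kB' % vm_size_kb())

    # Each of these tuples is far below the 256-byte threshold, so all of
    # them are handed out by pymalloc rather than the system malloc.
    junk = [(i,) for i in range(2000000)]
    print('after allocating: %d kB' % vm_size_kb())

    del junk  # the objects are dead, but pymalloc keeps the 256 kB chunks
    print('after freeing:    %d kB' % vm_size_kb())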
> That run will create a lot of little integer objects and the virtual memory size of the interpreter will quickly grow to 155MB and then drop to 117MB. The 117MB left are all those little integer objects that are not in use any more that the interpreter would reuse as needed.
Oops, you are right: I was incorrect in my previous post. Python does not limit the number of free integers, etc. that it keeps around. This is part of the problem: each of Python's own native types maintains its own freelist. If you create a huge list of ints, deallocate it, then create a huge list of floats, the floats will be given brand new memory, even though the memory freed by the integers is sitting unused on the int freelist. If the pymalloc allocator were used instead, it would recycle the memory that was used by the integers. At the moment, even if I change the behaviour of pymalloc, it still won't solve this problem, as each type keeps its own freelist. This is also on my TODO list: benchmark the performance of using the pymalloc allocator for these types.
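The same kind of experiment shows the freelist problem. Again, this is only a sketch with made-up sizes, it is Linux-only, and the effect is specific to interpreters where ints and floats are carved out of their own per-type blocks:

    # Hedged illustration of the per-type freelist issue described above:
    # memory released by the integer objects stays on the int freelist, so
    # the later burst of float objects cannot reuse it and the process
    # grows again instead of staying flat.

    def vm_size_kb():
        # Current virtual size of this process in kB, from /proc (Linux only).
        with open('/proc/self/status') as f:
            for line in f:
                if line.startswith('VmSize:'):
                    return int(line.split()[1])

    ints = list(range(3000000))                  # a few million int objects
    print('after ints:   %d kB' % vm_size_kb())
    del ints                                     # freed ints go onto the int freelist

    floats = [float(i) for i in range(3000000)]  # fresh memory; the ints' memory is not reused
    print('after floats: %d kB' % vm_size_kb())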
> In our application, paging to swap is extremely bad because sometimes we're running the OS booted from the net without swap. The daemon has to loop over lists of 20 to 40 thousand items at a time, and it quickly grows to 60MB on the first run and then continues to grow from there. When something else needs memory, it tries to swap and then crashes.
This is another good point: some systems do not have swap. The changes I am proposing would not make Python less memory hungry, but they would mean that memory used during a temporary spike would be returned to the operating system afterwards, instead of being held by the interpreter for the rest of the process's life.
> In the example above, the difference between 155MB and 117MB is 37MB, which I assume is the size of the list object returned by 'range()', which contains the references to the integers. The list goes away when the interpreter finishes running the loop, and because it was already known how big it was going to be, it was allocated as a big chunk using mmap (my speculation). As a result, that memory was given back to the OS and the virtual memory size of the interpreter went down from 155MB to 117MB.
That is correct. If you look at the implementation for lists, it keeps at most 80 freed list objects around for reuse, and immediately frees the memory for the array that holds the elements. Again, this seems sub-optimal to me: if a program uses a lot of lists, 80 may not be enough; for others, 80 may be too much. It seems to me that a more dynamic allocation policy could be more efficient.
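To make the policy easier to picture, here is a toy analogue in Python of what the list implementation does. It is purely illustrative (the real code is C inside the interpreter, and the names below are invented), but it captures the fixed cap of 80 cached list objects and the immediate release of the element array:

    # Toy, purely illustrative analogue of the fixed-cap freelist policy
    # for list objects; the names are invented, not the interpreter's.
    MAX_FREE_LISTS = 80
    _free_lists = []

    def deallocate_list(obj):
        # The array holding the elements is released immediately...
        del obj[:]
        # ...but up to 80 empty list "shells" are cached for reuse.
        if len(_free_lists) < MAX_FREE_LISTS:
            _free_lists.append(obj)
        # beyond the cap, the shell itself would be freed as well

    def allocate_list():
        # Reuse a cached shell if one is available, otherwise build a new one.
        return _free_lists.pop() if _free_lists else []

Evan Jones

--
Evan Jones: http://evanjones.ca/
"Computers are useless. They can only give answers" - Pablo Picasso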