On Oct 19, 2004, at 14:00, Martin v. Löwis wrote:
Some posts to various lists have stated that this is not a real problem because virtual memory takes care of it. That is fair if you are talking about a couple of megabytes. In my case, I'm talking about ~700 MB of wasted RAM, which is a problem.
This is not true. The RAM is not wasted. As you explain later, the pages will be swapped out to swap space, making the RAM available again for other tasks.
Well, it isn't "wasted," but it is not optimal. If the pages were freed, the OS would use them for disk cache (or for other programs). However, because the operating system believes that these pages still contain live data, it must do one of two things:
a) Live with less disk cache (lower performance for disk I/O).
b) Pre-emptively swap the pages to disk, which is very slow. (On Linux, you can control how pre-emptive the kernel is by adjusting the "swappiness" sysctl.)
If it chooses to swap them out, the next time Python touches those pages, it will pause as the OS reads them back from disk.
It can only help the system's performance if we give it hints about which pages are no longer in use.
If Python ever wants to use them again, they will be brought in from swap.
Yes. However, your assumption is that Python never wants to use them again, because the peak memory consumption is only local.
I am trying to correct the situation where Python is not going to use the pages for a long time. For most applications, Python's memory allocation policies are fine, but if you have a long-running process that is idle most of the time (say, a low-usage server), or one that does some huge pre-processing step (my application), it keeps a ton of memory around for no reason. Right now, Python performs very poorly for my application because I have this massive memory peak and a very low average memory usage.
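To make the shape of the problem concrete, here is a minimal sketch (the names and sizes are hypothetical stand-ins for my application's pre-processing):

```python
def preprocess():
    # The peak: a large temporary structure that exists only
    # during this one phase of the program.
    table = {i: str(i) for i in range(100_000)}
    return len(table)   # table becomes garbage once this returns

result = preprocess()
# The objects above are freed here, but with the current allocator the
# underlying pools are kept by the process, so its memory footprint
# stays near the peak even though average usage is now tiny.
```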
Were I using Java, its usage would grow and shrink accordingly, thanks to the garbage collector releasing memory to the OS. Yes, with Python, we can't compact memory, but I think we can still do better than nothing.
As the working set grows or shrinks, pages get swapped in and out. As Tim explains, this is really hard to avoid.
If you actually tell the operating system that the pages are unused, it won't swap unless it actually needs to. Right now, a lot of pages are being swapped in and out that are actually *garbage*.
Unfortunately, as Tim explains, there is no way to reliably "inform" the system. free(3) may or may not be taken as such information.
As noted before, free() may not be sufficient, but mmap or madvise are.
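For illustration, here is what such a hint looks like. This sketch uses the Python-level mmap wrapper (the madvise wrapper is a modern addition, so the hasattr guard is needed); the underlying C calls are the same ones an allocator would make:

```python
import mmap

# An anonymous mapping standing in for an allocator arena.
arena = mmap.mmap(-1, 16 * mmap.PAGESIZE)
arena[:] = b"x" * len(arena)   # touch every page: they now occupy RAM

# Hint to the kernel that the contents are garbage.  On Linux,
# MADV_DONTNEED lets the kernel reclaim the pages immediately instead
# of ever swapping them out; later reads see zero-filled pages.
if hasattr(mmap, "MADV_DONTNEED"):   # platform-dependent
    arena.madvise(mmap.MADV_DONTNEED)
arena.close()
```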
The garbage collector holds the GIL. So while there could be other threads running, they must not manipulate any PyObject*. If they try to, they need to obtain the GIL first, which will make them block until the garbage collector has finished.
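For what it's worth, this is the discipline a well-behaved extension thread has to follow. A sketch of the C-API calls involved, driven from Python via ctypes purely for illustration (a real extension would make these calls directly in C):

```python
import ctypes
import threading

results = []

def worker():
    # A C extension thread must bracket any PyObject* manipulation
    # with these two calls.  PyGILState_Ensure blocks until the GIL
    # is available (e.g. until the collector releases it); here the
    # call is recursive, since a Python-level thread already holds
    # the GIL while running this code.
    state = ctypes.pythonapi.PyGILState_Ensure()
    try:
        results.append(len("PyObject"))  # safe: we hold the GIL
    finally:
        ctypes.pythonapi.PyGILState_Release(state)

t = threading.Thread(target=worker)
t.start()
t.join()
```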
But as noted in a previous message, some extensions may not do this correctly, and try to do PyObject_Free anyway. Is that the problem that obmalloc tries to avoid? If the problem is only the possibility of PyObject_Free being called while another thread has the GIL, then I can probably avoid that issue.
That will ultimately depend on the patches. The feature itself would be fine, as Tim explains.
Great! That's basically what I am looking for.
However, patches might be rejected because:
Of course, I certainly hope that Python wouldn't accept garbage patches! :)
Thank you for your comments,
-- Evan Jones: http://evanjones.ca/ "Computers are useless. They can only give answers" - Pablo Picasso