[Python-ideas] Copy-on-write when forking a python process

Wed Apr 13 10:34:11 CEST 2011

On Tue, 12 Apr 2011 14:42:43 -0700 (PDT)
jac <john.theman.connor at gmail.com> wrote:

> Hi all,
> Sorry for cross posting, but I think that this group may actually be
> more appropriate for this discussion.  Previous thread is at:
> http://groups.google.com/group/comp.lang.python/browse_thread/thread/1df510595483b12f
> 
> I am wondering if anything can be done about the COW (copy-on-write)
> problem when forking a python process.  I have found several
> discussions of this problem, but I have seen no proposed solutions or
> workarounds.  My understanding of the problem is that an object's
> reference count is stored in the "ob_refcnt" field of the PyObject
> structure itself.  When a process forks, its memory is initially not
> copied. However, if any references to an object are made or destroyed
> in the child process, the page containing the object's "ob_refcnt"
> field will be copied.

This smells like premature optimization to me. You're worried about
the kernel copying a few extra pages of user data when you're dealing
with a dictionary that's gigabytes in size. Sounds like any possible
memory savings here would be much smaller than those that could come
from improving the data encoding.

But maybe it's not premature. Do you have measurements that show how
much extra swap space is taken up by COW copies caused by changing
reference counts in your application?
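
One rough way to get such a measurement, as a sketch only: fork, have
the child do nothing but walk the data, and compare the child's
Private_Dirty total from /proc/self/smaps before and after. This
assumes Linux and CPython. The helper names private_dirty_kb and
cow_growth_kb are made up for illustration, and the list of strings is
just a stand-in for the real dictionary.

```python
import os


def private_dirty_kb():
    """Sum the Private_Dirty fields of /proc/self/smaps, in kB (Linux only)."""
    total = 0
    with open("/proc/self/smaps") as smaps:
        for line in smaps:
            if line.startswith("Private_Dirty:"):
                total += int(line.split()[1])
    return total


def cow_growth_kb(objects):
    """Fork, walk `objects` in the child, return the child's growth in kB."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: always leave through os._exit so that it never
        # falls back into the caller's code.
        try:
            os.close(read_fd)
            before = private_dirty_kb()
            # On CPython, iterating takes and drops a reference to each
            # object, which writes to its ob_refcnt field.
            for obj in objects:
                pass
            after = private_dirty_kb()
            os.write(write_fd, str(after - before).encode())
        finally:
            os._exit(0)
    # Parent process: collect the child's report.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        result = pipe.read()
    os.waitpid(pid, 0)
    return int(result)


if __name__ == "__main__":
    # Hypothetical stand-in for the real data set.
    data = [str(i) for i in range(200000)]
    print("child Private_Dirty growth: %d kB" % cow_growth_kb(data))
```

The number is only an estimate: the measuring code allocates a little
memory of its own, and the figure is resident memory rather than swap.
Still, comparing it against the total size of the data should show
whether the copies are large enough to matter.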

      <mike
-- 
Mike Meyer <mwm at mired.org>		http://www.mired.org/consulting.html
Independent Software developer/SCM consultant, email for more information.