Copy-on-write when forking a Python process
John Connor
john.theman.connor at gmail.com
Fri Apr 8 12:14:19 EDT 2011
Hi all,
Long time reader, first time poster.
I am wondering if anything can be done about the COW (copy-on-write)
problem when forking a Python process. I have found several
discussions of this problem, but I have seen no proposed solutions or
workarounds. My understanding of the problem is that an object's
reference count is stored in the "ob_refcnt" field of the PyObject
structure itself. When a process forks, its memory is initially
shared with the child rather than copied. However, if any reference
to an object is created or destroyed in the child process, the page
containing that object's "ob_refcnt" field is written to and
therefore copied.
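To make the point concrete, here is a minimal sketch, assuming
CPython, showing that merely storing a new reference to an object
changes its reference count. Since the count lives in the object's
own header, that change is a write to the memory holding the object,
which is what defeats copy-on-write after a fork. The function name
refcount_delta is made up for this illustration.

```python
import sys


def refcount_delta():
    """Return how much an object's refcount changes when one new
    reference to it is stored (CPython-specific behaviour)."""
    obj = object()
    holders = []

    before = sys.getrefcount(obj)
    # Appending stores one more reference, so the interpreter
    # increments ob_refcnt inside obj's own header.
    holders.append(obj)
    after = sys.getrefcount(obj)

    return after - before


if __name__ == "__main__":
    print("refcount change:", refcount_delta())
```

No attribute of the object is modified here; only the bookkeeping
field changes, yet that is enough to dirty the page.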
My first thought was the obvious one: make the ob_refcnt field a
pointer into an array of all object refcounts stored elsewhere.
However, I do not think there is a way of doing this without adding
a lot of complexity. So my current thinking is that it should be
possible to disable refcounting for an object. This could be done by
adding a field to PyObject named "ob_optout". If ob_optout is true,
then Py_INCREF and Py_DECREF will have no effect on the object:
    from refcount import optin, optout

    class Foo: pass

    mylist = [Foo() for _ in range(10)]
    optout(mylist)       # Sets ob_optout to true
    for element in mylist:
        optout(element)  # Sets ob_optout to true

    Fork_and_block_while_doing_stuff(mylist)

    optin(mylist)        # Sets ob_optout to false
    for element in mylist:
        optin(element)   # Sets ob_optout to false
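The intended semantics can be modelled in pure Python. This is only a
toy sketch under stated assumptions: ToyObject, incref, decref and
the model's optout/optin are hypothetical stand-ins for the PyObject
header, Py_INCREF, Py_DECREF and the proposed API. None of it touches
the interpreter's real reference counts.

```python
class ToyObject:
    """Hypothetical stand-in for a PyObject header (not CPython's
    real layout): a refcount plus the proposed opt-out flag."""

    def __init__(self):
        self.ob_refcnt = 1
        self.ob_optout = False


def incref(obj):
    # Models Py_INCREF: a no-op while the object is opted out.
    if not obj.ob_optout:
        obj.ob_refcnt += 1


def decref(obj):
    # Models Py_DECREF: a no-op while the object is opted out.
    if not obj.ob_optout:
        obj.ob_refcnt -= 1


def optout(obj):
    obj.ob_optout = True


def optin(obj):
    obj.ob_optout = False


if __name__ == "__main__":
    o = ToyObject()
    optout(o)
    incref(o)           # ignored, so the header is not written
    decref(o)           # ignored as well
    optin(o)
    print(o.ob_refcnt)  # still the value it had before optout
```

While the flag is set, the count is never written, which is the
property that would keep the page shared after a fork. The model is
also a convenient place to experiment with sequences that mix
opted-out and opted-in operations on the same object.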
Has anyone else looked into the COW problem? Are there workarounds
and/or other plans to fix it? Does the solution I am proposing sound
reasonable, or does it seem like overkill? Does anyone foresee any
problems with it?
Thanks,
--jac