Copy-on-write when forking a python process

jac john.theman.connor at gmail.com
Tue Apr 12 17:13:20 EDT 2011


Heiko,
Thank you for pointing out POSH.  I have used some of Python's other
shared memory facilities, but I was completely unaware of POSH; it
seems nice.
Also, I agree that shared memory would solve the use case I outlined
above, but it is not hard to imagine a slightly different case where
the child processes do want to mutate the dictionary, but do not want
the changes to show up in the parent process's dictionary. With shared
memory, a more complicated algorithm would be needed to get that
behavior.
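To make that second case concrete, here is a minimal sketch of the
behavior I mean, assuming a Unix system where os.fork is available.
The function name fork_and_mutate and the "answer" key are made up for
illustration only. The child changes its dictionary, and the parent's
copy stays as it was:

```python
import os


def fork_and_mutate(data):
    """Fork; the child mutates its own copy of `data`.

    Returns (value seen by the child, value seen by the parent).
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: mutate the private copy and report it via the pipe.
        status = 1
        try:
            os.close(read_fd)
            data["answer"] = "changed in child"
            os.write(write_fd, data["answer"].encode())
            os.close(write_fd)
            status = 0
        finally:
            # Leave immediately so the child never runs the parent's code.
            os._exit(status)
    # Parent process: read what the child saw, then look at our own copy.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        child_view = reader.read().decode()
    os.waitpid(pid, 0)
    return child_view, data["answer"]


if __name__ == "__main__":
    print(fork_and_mutate({"answer": "original"}))
```

This isolation comes from the semantics of fork itself. Whether the
pages behind the dictionary are shared copy-on-write or copied eagerly
is up to the operating system.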
Long story short, shared memory is great for some things, and
copy-on-write is great for others. Sometimes they are easily
interchangeable; sometimes they are not.  Many, if not most, languages
allow the operating system to perform optimizations when a program
forks that Python does not, because of the way it counts references:
updating an object's reference count writes to the page that holds the
object, so the OS has to copy that page even if the object itself is
never changed.  Many programs would be easier to write if Python
allowed the OS to use COW.
However, since I have gotten no other feedback on this list, I think
that I will post this to python-ideas as well.
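To illustrate the reference counting point, here is a small sketch,
assuming a standard CPython build. The helper name
refcount_delta_from_alias is hypothetical. It only shows that giving
an object a second name changes the count stored with the object; it
does not measure page copying directly:

```python
import sys


def refcount_delta_from_alias():
    """Return how much an object's reference count changes when it is aliased."""
    payload = ["some", "large", "structure"]  # an ordinary heap object
    before = sys.getrefcount(payload)
    alias = payload  # no mutation of the list, just one more reference
    after = sys.getrefcount(payload)
    del alias
    return after - before


if __name__ == "__main__":
    print(refcount_delta_from_alias())
```

On a standard CPython build this should report a difference of 1. The
count lives in the object's own header, so in a forked child even this
kind of read-only use writes to the page holding the object.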

Thanks,
--jac


On Apr 8, 4:29 pm, Heiko Wundram <modeln... at modelnine.org> wrote:
> Am 08.04.2011 20:34, schrieb jac:
>
> > I disagree with your statement that COW is an optimization for a
> > complete clone, it is an optimization that works at the memory page
> > level, not at the memory image level.  In other words, if I write to a
> > copy-on-write page, only that page is copied into my process' address
> > space, not the entire parent image.  To the best of my knowledge by
> > preventing the child process from altering an object's reference count
> > you can prevent the object from being copied (assuming the object is
> > not altered explicitly of course.)
>
> As I said before: COW for "sharing" a process's forked memory is simply
> an implementation detail, and an _optimization_ (and of course a
> sensible one at that) for fork; there is no provision in the semantics
> of fork that an operating system should use COW memory-pages for
> implementing the copying (and early UNIXes didn't do that; they
> explicitly copied the complete process image for the child). The only
> semantic that is specified for fork is that the parent and the child
> have independent process images, that are equivalent copies (except for
> some details) immediately after the fork call has returned successfully
> (see SUSv4).
>
> What you're thinking of (and what's generally useful in the context
> you're describing) is shared memory; Python supports putting objects
> into shared memory using e.g. POSH (which is an extension that allows
> you to place Python objects in shared memory, using the SysV
> IPC-featureset that most UNIXes implement today).
>
> --
> --- Heiko.



