
Hey Inada, thanks for the feedback!
Generally speaking, fork is a legacy API. It is too difficult to know which library is fork-safe, even for stdlibs.
Yes, Instagram has to go to great lengths to get the entire execution into a state where it's safe to fork. It works, but it's hard to maintain. We'd rather have a simpler model!
I hope per-interpreter GIL replaces fork use cases.
We hope so too, hence the big push towards having immutable shared state across the interpreters. For large applications like Instagram, this is a must; otherwise, copying state into every interpreter would be too costly.
Anyway, I don't believe stopping refcounting will fix the CoW issue yet. See this article [1] again.
That article is five years old, so it doesn't reflect the current state of the system! We continuously profile and monitor copy-on-write (CoW), and after introducing the techniques described in this PEP, we have fixed the majority of the scenarios where it happens. You are right that addressing reference counting alone will not fix every CoW issue. The trick is to also leverage the permanent GC generation used by the `gc.freeze` API. That is, if a container is known to be immortal, it should be pushed into the permanent GC generation. This guarantees that the GC itself will not modify the GC header of that instance. Thus, if you immortalize your heap before forking (using the techniques in https://github.com/python/cpython/pull/31489), you remove the vast majority of scenarios where CoW takes place. I can look into writing a new technical article for Instagram with more up-to-date info, but that might take time to get through! Now, I said that we've largely fixed the CoW issue because there are still places where it happens, such as free lists, the small object allocator, etc. But these are relatively small compared to the ones coming from reference counts and GC header mutations.
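To make the `gc.freeze` part concrete, here is a minimal sketch of the pre-fork pattern using only the stock `gc` and `os` modules. It covers just the permanent-generation half of the story: the reference-count half needs the interpreter changes from the PR above and can't be shown from pure Python. The helper name `fork_worker` and its `child_main` parameter are made up for illustration, and this is not Instagram's actual code.

```python
import gc
import os


def fork_worker(child_main):
    """Freeze the heap, fork, and run child_main() in the child.

    child_main is a hypothetical callable returning the child's exit code.
    Returns the child's pid in the parent. POSIX only (uses os.fork).
    """
    # Move every object the GC currently tracks into the permanent
    # generation, so later collections skip them and leave their
    # GC headers untouched.
    # (The gc docs also suggest calling gc.disable() early in the parent
    # so that collections don't leave freed holes in memory pages.)
    gc.freeze()

    pid = os.fork()
    if pid != 0:
        return pid  # parent: hand the pid back to the caller

    # Child: collections are safe again; frozen objects are ignored.
    code = 1
    try:
        gc.enable()
        code = int(child_main())
    except BaseException:
        code = 1
    finally:
        # Leave without running the parent's cleanup handlers.
        os._exit(code)
```

In the parent you would call `fork_worker(...)` once warm-up is done and then reap the child with `os.waitpid(pid, 0)`; `gc.get_freeze_count()` tells you how many objects ended up in the permanent generation.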