
On Mon, 27 Jul 2020 at 23:24, Vinay Sharma vinay04sharma@icloud.com wrote:
Hi, Thanks for replying.
One thing that is worth thinking about is the safety of the API that is put together. A memory segment plus a separate detached semaphore or mutex can be used to build a safe API, but is not itself a safe API.
Agreed. That’s why I am more inclined to the second solution that I mentioned.
The second approach isn't clearly specified yet: does 'sync' in the name imply a mutex, an RW lock, or a dependence on pointers to atomic types (which then becomes a portability challenge in some cases)? The C++ atomics documentation you linked to documents a similar but differently named set of methods, so you'll need to clarify the difference you intend.
For instance, we could have an object representing a memory range that doesn't offer read/write at all, but allows:
- either one process write access over the range
- or any number of readers read access over the range
- subdividing the range (so that you can e.g. end one write lock and keep another)
Where will this memory object be stored?
There are a few options. The most obvious one, given that bookkeeping data is required, is to build a separate layer offering this functionality, which uses the now batteries-included SHM facilities as part of its implementation but doesn't directly surface them.
Locking a particular range instead of the whole memory segment will be relatively efficient because processes using different ranges can write simultaneously.
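To make the range idea concrete, here is a minimal sketch of the bookkeeping only. RangeTable, try_acquire and release are hypothetical names, not an existing or proposed multiprocessing API. The table lives in one process here; a real version would have to keep it somewhere shared and update it under a lock of its own.

```python
class RangeTable:
    """Hypothetical bookkeeping for byte-range locks (single process only)."""

    def __init__(self):
        # Each entry is (start, stop, mode); mode is "r" (shared) or "w" (exclusive).
        self._held = []

    def try_acquire(self, start, stop, mode):
        """Return a grant tuple, or None if the request conflicts with a held range."""
        if mode not in ("r", "w") or not 0 <= start < stop:
            raise ValueError("bad range or mode")
        for held_start, held_stop, held_mode in self._held:
            overlaps = start < held_stop and held_start < stop
            # Overlapping ranges only coexist when both sides are readers.
            if overlaps and "w" in (mode, held_mode):
                return None
        grant = (start, stop, mode)
        self._held.append(grant)
        return grant

    def release(self, grant):
        self._held.remove(grant)


table = RangeTable()
first = table.try_acquire(0, 64, "w")      # writer on the first half
second = table.try_acquire(64, 128, "w")   # disjoint writer, also granted
blocked = table.try_acquire(32, 96, "r")   # overlaps both writers, refused
```

Two writers on disjoint ranges are both granted, which is the efficiency gain mentioned above; anything overlapping a writer is refused.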
Since this object will also be shared across multiple processes, there must be a safe way to update it.
There's a lot of prior art on named locks of various sorts. I'd personally be inclined to give these things a name that can be used across different processes in some form, and bootstrap from there.
Any thoughts on that?
On 27-Jul-2020, at 3:50 PM, Robert Collins robertc@robertcollins.net wrote:
On Sun, 26 Jul 2020 at 19:11, Vinay Sharma via Python-ideas python-ideas@python.org wrote:
Problem: Currently, let’s say I create a shared memory segment using multiprocessing.shared_memory.SharedMemory in Process 1 and open the same segment in Process 2. If I then write data to the segment from both processes, then to prevent a race condition (data corruption) either these operations must be atomic, or I must be able to lock / unlock the shared memory segment, which I cannot do at the moment.
I earlier posted a solution to this problem, which received a positive response, but there weren’t many responses to it, despite the fact that this problem makes shared_memory practically unusable when there are simultaneous writes. So the purpose of this post is to discuss a solution to it.
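For illustration, the race described above can be worked around today by pairing the segment with a separately passed multiprocessing.Lock. This is only a sketch: add_many and run are hypothetical helper names, and the 8-byte counter layout is an assumption made for the example. Nothing ties the lock to the segment, so any process that skips the lock can still corrupt the counter.

```python
import struct
from multiprocessing import Lock, Process
from multiprocessing.shared_memory import SharedMemory


def add_many(name, lock, n):
    """Attach to the segment by name and increment an 8-byte counter n times."""
    shm = SharedMemory(name=name)
    try:
        for _ in range(n):
            # The read-modify-write below is only safe because every
            # writer agrees, by convention, to take the same lock first.
            with lock:
                (value,) = struct.unpack_from("<q", shm.buf, 0)
                struct.pack_into("<q", shm.buf, 0, value + 1)
    finally:
        shm.close()


def run(workers=2, n=10_000):
    """Start several writer processes and return the final counter value."""
    lock = Lock()
    shm = SharedMemory(create=True, size=8)
    try:
        struct.pack_into("<q", shm.buf, 0, 0)
        procs = [Process(target=add_many, args=(shm.name, lock, n))
                 for _ in range(workers)]
        for proc in procs:
            proc.start()
        for proc in procs:
            proc.join()
        return struct.unpack_from("<q", shm.buf, 0)[0]
    finally:
        shm.close()
        shm.unlink()
```

Call run() from under an `if __name__ == "__main__":` guard so that it also works with the spawn start method. Because the lock has to be handed to each child explicitly, this approach does not help unrelated processes that only know the segment's name.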
One thing that is worth thinking about is the safety of the API that is put together. A memory segment plus a separate detached semaphore or mutex can be used to build a safe API, but is not itself a safe API.
A safe API shouldn't allow writes to the memory segment while the mutex is unlocked at all, which is different from merely allowing one to build a safe API from the various pieces. (There may / will be lower-level primitives that are unsafe.)
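A rough sketch of what that could look like in Python: the wrapper below hands out a writable view only while its lock is held, and releases the view on exit. GuardedSegment, write and snapshot are hypothetical names, not an existing API, and the enforcement is best effort only, since Python cannot stop a caller from copying out a sub-view or reaching into the private attributes.

```python
from contextlib import contextmanager
from multiprocessing import Lock
from multiprocessing.shared_memory import SharedMemory


class GuardedSegment:
    """Hypothetical wrapper: the buffer is reachable only while the lock is held."""

    def __init__(self, size, lock=None):
        self._shm = SharedMemory(create=True, size=size)
        self._size = size  # the OS may round the segment up, so remember our size
        self._lock = lock if lock is not None else Lock()

    @contextmanager
    def write(self):
        """Yield a writable memoryview for the duration of the with-block."""
        with self._lock:
            view = self._shm.buf[:self._size]
            try:
                yield view
            finally:
                # A released memoryview raises ValueError on any later access.
                view.release()

    def snapshot(self):
        """Return a bytes copy of the segment, taken under the lock."""
        with self._lock:
            return bytes(self._shm.buf[:self._size])

    def close(self):
        self._shm.close()
        self._shm.unlink()
```

Typical use would be `with seg.write() as view: view[:3] = b"abc"`. Once the block ends, the view is dead, so a stale reference cannot be used to write with the lock released.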
We can look at a lot of the APIs in the Rust community for examples of this sort of thing.
Python doesn't have the borrow checker to enforce usage, but we could still work from the same basic principle - given there are multiple processes involved that make it easier to have safe outcomes.
For instance, we could have an object representing a memory range that doesn't offer read/write at all, but allows:
- either one process write access over the range
- or any number of readers read access over the range
- subdividing the range (so that you can e.g. end one write lock and keep another)
For instance, https://doc.rust-lang.org/std/vec/struct.Vec.html#method.split_at_mut is an in-process API that is very similar.
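A Python analogue of that split could look roughly like the sketch below. WriteGrant and its methods are hypothetical names, a bytearray stands in for the shared buffer, and the checks happen at run time rather than at compile time as in Rust.

```python
class WriteGrant:
    """Hypothetical token for exclusive write access to buf[start:stop]."""

    def __init__(self, buf, start, stop):
        self._buf = buf
        self.start = start
        self.stop = stop
        self._live = True

    def _check(self):
        if not self._live:
            raise RuntimeError("grant was split or released")

    def split_at(self, mid):
        """Consume this grant and return two grants covering its halves."""
        self._check()
        if not self.start < mid < self.stop:
            raise ValueError("split point must fall strictly inside the range")
        self._live = False  # the parent can no longer be used
        return (WriteGrant(self._buf, self.start, mid),
                WriteGrant(self._buf, mid, self.stop))

    def write(self, offset, data):
        """Write data at an offset relative to the start of the grant."""
        self._check()
        begin = self.start + offset
        end = begin + len(data)
        if offset < 0 or end > self.stop:
            raise ValueError("write falls outside the granted range")
        self._buf[begin:end] = data

    def release(self):
        self._check()
        self._live = False


buf = bytearray(8)                 # stand-in for a shared memory buffer
whole = WriteGrant(buf, 0, 8)
left, right = whole.split_at(4)
left.release()                     # end one write lock...
right.write(0, b"abcd")            # ...and keep using the other
```

After the split, the original grant is dead and each half can be released independently, which is the "end one write lock and keep another" case from the list above.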
-Rob