On Mon, 27 Jul 2020 at 23:24, Vinay Sharma <vinay04sharma@icloud.com> wrote:
Hi, Thanks for replying.
One thing that is worth thinking about is the safety of the API that
is put together. A memory segment plus a separate detached semaphore
or mutex can be used to build a safe API, but is not itself a safe
API.
Agreed. That’s why I am more inclined to the second solution that I mentioned.
The second approach isn't clearly specified yet: is 'sync' in the name
implying a mutex, an RW lock, or dependent on pointers to atomic types
(which then becomes a portability challenge in some cases). The C++
atomics documentation you linked to documents a similar but
differently named set of methods, so you'll need to clarify the
difference you intend.
Python has support for atomic types, I guess:
And, these methods don’t use any locks, they are just atomic operations.
So, my approach was to lock the whole shared memory segment at once. To do that, we can store an integer at the beginning of every shared memory segment that denotes whether the segment is locked (1) or unlocked (0), and atomic operations can be used to update this integer: 0 -> 1 to lock, 1 -> 0 to unlock. A `wait` function will still have to be implemented, as in semaphores, which waits until the segment is free (the integer becomes 0).
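A minimal sketch of that flag-in-the-header idea, under loud assumptions: the Python standard library does not offer an atomic compare-and-swap on a shared buffer, so the hypothetical `_compare_and_swap` below emulates one with an ordinary lock passed in as `guard`. The names `FlaggedSegment`, `try_lock`, `unlock`, `wait`, and `payload` are all made up for illustration and are not part of `multiprocessing.shared_memory`.

```python
import struct
import threading
import time

FLAG = struct.Struct("i")  # lock word stored at offset 0 of the segment
UNLOCKED, LOCKED = 0, 1


class FlaggedSegment:
    """Whole-segment lock kept in the first bytes of a writable buffer."""

    def __init__(self, buf, guard):
        self._buf = buf      # any writable buffer, e.g. SharedMemory(...).buf
        self._guard = guard  # ASSUMPTION: stand-in for a real atomic operation

    def _compare_and_swap(self, expected, new):
        # Emulated CAS; a real implementation would use one atomic instruction.
        with self._guard:
            (current,) = FLAG.unpack_from(self._buf, 0)
            if current != expected:
                return False
            FLAG.pack_into(self._buf, 0, new)
            return True

    def try_lock(self):
        """Attempt the 0 -> 1 transition once; True if this caller won."""
        return self._compare_and_swap(UNLOCKED, LOCKED)

    def unlock(self):
        """Attempt the 1 -> 0 transition; False if the flag was not set."""
        return self._compare_and_swap(LOCKED, UNLOCKED)

    def wait(self, timeout=None, poll=0.001):
        """Poll until the lock is acquired, or return False on timeout."""
        deadline = None if timeout is None else time.monotonic() + timeout
        while not self.try_lock():
            if deadline is not None and time.monotonic() >= deadline:
                return False
            time.sleep(poll)
        return True

    @property
    def payload(self):
        """The bytes after the lock word, i.e. the data being protected."""
        return memoryview(self._buf)[FLAG.size:]


# Demo on a plain bytearray; threading.Lock only protects threads in one process.
segment = FlaggedSegment(bytearray(FLAG.size + 16), threading.Lock())
if segment.wait(timeout=1.0):
    segment.payload[0] = 42
    segment.unlock()
```

Nothing here stops a process from writing to `payload` without taking the lock, and the polling `wait` is only the simplest possible form of waiting.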
For instance, we could have an object representing a memory range that
doesn't offer read/write at all, but allows:
- either one process write access over the range
- or any number of readers read access over the range
- allows subdividing the range (so that you can e.g. end one write
lock and keep another)
Where will this memory object be stored?
There are a few options. The most obvious one given that bookkeeping
data is required, is to build a separate layer offering this
functionality, which uses the now batteries-included SHM facilities as
part of its implementation, but doesn't directly surface it.
Can you please elaborate more on this?
I understand that shared memory will be used to store the ranges and whether they are locked or unlocked, etc. But if multiple processes can update this data, then we will also have to think about synchronising this book-keeping data.
So, I guess you mean that all processes will be allotted shared memory through a separate API/layer that takes care of the book-keeping, and since only this layer will be responsible for the book-keeping, there will be no need to synchronise the book-keeping data.
But then the question arises: how will unrelated processes communicate with this layer/API to request shared memory?
One way could be to create a separate process that manages this book-keeping, and other processes would request access/lock/unlock through this separate process.
And the communication between this layer (separate process) and the other processes (using shared memory) will use some form of IPC.
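A rough sketch of that single-owner idea, with everything named here being hypothetical: `serve` is the only code that touches the `held` table, so the table itself needs no lock. For the demo a thread stands in for the separate process and `multiprocessing.Pipe` stands in for the IPC channel; unrelated processes would presumably need something addressable instead, such as `multiprocessing.connection.Listener` and `Client` with an agreed address.

```python
import threading
from multiprocessing import Pipe
from multiprocessing.connection import wait


def serve(conns):
    """Bookkeeping loop: runs until every client has disconnected."""
    conns = list(conns)
    held = {}  # segment name -> connection that currently holds its lock
    while conns:
        for conn in wait(conns):  # blocks until at least one is readable
            try:
                op, name = conn.recv()
            except (EOFError, OSError):
                # Client went away: forget it and release whatever it held.
                conns.remove(conn)
                conn.close()
                for key in [k for k, owner in held.items() if owner is conn]:
                    del held[key]
                continue
            if op == "lock" and name not in held:
                held[name] = conn
                conn.send(True)
            elif op == "unlock" and held.get(name) is conn:
                del held[name]
                conn.send(True)
            else:
                conn.send(False)  # already locked, not the owner, or unknown op


def request(conn, op, name):
    """Client side: send one request and wait for the True/False reply."""
    conn.send((op, name))
    return conn.recv()


def start_broker(n_clients):
    """Demo helper: run the broker in a thread and return the client ends."""
    pairs = [Pipe() for _ in range(n_clients)]
    thread = threading.Thread(
        target=serve, args=([srv for srv, _ in pairs],), daemon=True
    )
    thread.start()
    return thread, [client for _, client in pairs]
```

A client would call `request(conn, "lock", "my_segment")` before writing and `request(conn, "unlock", "my_segment")` afterwards. This sketch refuses a contended lock rather than queueing the caller, so blocking or retrying is left to the client.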
Locking a particular range instead of the whole memory segment will be relatively efficient because processes using different ranges can write simultaneously.
Since this object will also be shared across multiple processes, there must be a safe way to update it.
There's a lot of prior art on named locks of various sorts, I'd
personally be inclined to give the things a name that can be used
across different processes in some form and bootstrap from there.
Any thoughts on that?
On 27-Jul-2020, at 3:50 PM, Robert Collins <robertc@robertcollins.net> wrote:
On Sun, 26 Jul 2020 at 19:11, Vinay Sharma via Python-ideas
<python-ideas@python.org> wrote:
Problem:
Currently, let’s say I create a shared memory segment using multiprocessing.shared_memory.SharedMemory in Process 1 and open the same segment in Process 2.
Then I try to write some data to the segment from both processes. To prevent a race condition (data corruption), either these operations must be atomic, or I should be able to lock/unlock the shared memory segment, which I cannot do at the moment.
I posted a solution to this problem earlier. It received a positive response, but there weren’t many replies, despite the fact that this problem makes shared_memory practically unusable when there are simultaneous writes.
So, the purpose of this post is to have a discussion about a solution to it.
One thing that is worth thinking about is the safety of the API that
is put together. A memory segment plus a separate detached semaphore
or mutex can be used to build a safe API, but is not itself a safe
API.
A safe API shouldn't allow writes to the memory segment while the
mutex is unlocked, rather than allowing one to build a safe API from
the various pieces. (There may / will be lower level primitives that
are unsafe).
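As a small illustration of that principle, here is a sketch in which the writable view is only handed out while the lock is held and is invalidated on exit. `GuardedBuffer` and `writing` are hypothetical names, and `threading.Lock` is only a stand-in for whatever cross-process lock would actually be used.

```python
import threading
from contextlib import contextmanager


class GuardedBuffer:
    """Wraps a buffer so that writes are only possible while the lock is held."""

    def __init__(self, buf, lock):
        self._buf = memoryview(buf)  # kept private; never handed out directly
        self._lock = lock            # ASSUMPTION: any lock-like context manager

    @contextmanager
    def writing(self):
        with self._lock:
            view = self._buf[:]  # fresh view over the same memory
            try:
                yield view
            finally:
                # Releasing the view makes any later use raise ValueError,
                # so a reference kept past the `with` block cannot write.
                view.release()


guarded = GuardedBuffer(bytearray(4), threading.Lock())
with guarded.writing() as view:
    view[0] = 7
```

This is only enforcement by convention: a caller who still holds the original buffer can bypass the wrapper, which is where the lower-level unsafe primitives come in.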
We can look at a lot of the APIs in the Rust community for examples of
this sort of thing.
Python doesn't have the borrow checker to enforce usage, but we could
still work from the same basic principle - given there are multiple
processes involved that make it easier to have safe outcomes.
For instance, we could have an object representing a memory range that
doesn't offer read/write at all, but allows:
- either one process write access over the range
- or any number of readers read access over the range
- allows subdividing the range (so that you can e.g. end one write
lock and keep another)
For instance, https://doc.rust-lang.org/std/vec/struct.Vec.html#method.split_at_mut
is an in-process API that is very similar.
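To make the shape of such an object concrete, here is an in-process sketch of the bookkeeping only: one writer or many readers per range, plus splitting a held range. `RangeTable` and `Grant` are hypothetical names, and the sketch deliberately leaves out where the table would live and how it would be shared between processes.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # compare by identity so look-alike grants stay distinct
class Grant:
    start: int
    stop: int
    mode: str  # "r" (shared) or "w" (exclusive)


class RangeTable:
    """Tracks which half-open byte ranges [start, stop) are currently held."""

    def __init__(self, size):
        self.size = size
        self._held = []

    def acquire(self, start, stop, mode):
        """Return a Grant, or None if the range conflicts with a held one."""
        if mode not in ("r", "w") or not 0 <= start < stop <= self.size:
            raise ValueError("bad range or mode")
        for grant in self._held:
            overlaps = start < grant.stop and grant.start < stop
            if overlaps and "w" in (mode, grant.mode):
                return None  # a writer never overlaps anyone else
        new_grant = Grant(start, stop, mode)
        self._held.append(new_grant)
        return new_grant

    def release(self, grant):
        self._held.remove(grant)

    def split(self, grant, mid):
        """Replace one grant with two adjacent ones that meet at `mid`."""
        if not grant.start < mid < grant.stop:
            raise ValueError("split point must fall inside the grant")
        index = self._held.index(grant)
        left = Grant(grant.start, mid, grant.mode)
        right = Grant(mid, grant.stop, grant.mode)
        self._held[index:index + 1] = [left, right]
        return left, right


table = RangeTable(100)
whole = table.acquire(0, 50, "w")
first, second = table.split(whole, 25)
table.release(first)  # end one write lock, keep the other
```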
-Rob
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/7BDCJYNXUJY6S3H3B3EDZZV5ZIUJOWD5/
Code of Conduct: http://python.org/psf/codeofconduct/