
One of the issues that showed up during the overlong TIOBE- thread and spinoffs is that there's no portable way to get a named shared memory segment (as distinguished from a disk-backed file) using the mmap module. Most Unix variants provide a memory-backed file system that works for this, but its name changes from distro to distro and even installation to installation. It's not clear to me that non-Unix platforms provide such a file system.

The POSIX solution is shm_open, which accepts a name for rendezvous and returns a file descriptor suitable for passing to mmap. Passing the file descriptor to anything but fstat, ftruncate, close and mmap is undefined. We'd also need to add shm_unlink to remove the shared segment, as the object created by shm_open isn't necessarily visible in the file system name space. shm_open has five values that can be used in its flags argument, but those are shared with open and already available in the os module.

This seems like a slam-dunk to me, but...

1) Is there some reason not to just add these two functions?
2) Are there any supported platforms with mmap and without shm_open/unlink?
3) Is this simple enough that a PEP isn't needed, just a patch in an issue?

Thanks, <mike
--
Mike Meyer <mwm@mired.org> http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

On Tue, 14 Feb 2012 18:50:44 -0500 Mike Meyer <mwm@mired.org> wrote:
A patch is enough. Note that this functionality is already available under Windows (though not really advertised in our docs), through the `tagname` parameter to mmap.mmap():
And in another session:
See http://docs.python.org/dev/library/mmap.html and http://msdn.microsoft.com/en-us/library/windows/desktop/aa366551%28v=vs.85%2... Regards Antoine.

On 15/02/2012 12:05am, Antoine Pitrou wrote:
It's not quite the same functionality since the lifetime of tagnamed mmaps is managed through handle refcounting. In some cases that is an advantage compared to open()/unlink(), and in others a disadvantage. Also, a problem with tagname is that there is no way to check whether the returned mmap was created by another process -- unless you resort to something undocumented like

    import mmap
    from _multiprocessing import win32

    f = mmap.mmap(-1, 4096, "mysharedmem")
    # Windows sets this error code when the named mapping already existed.
    if win32.GetLastError() == win32.ERROR_ALREADY_EXISTS:
        raise ValueError('tagname already exists')

sbt

On Wed, Feb 15, 2012 at 9:50 AM, Mike Meyer <mwm@mired.org> wrote:
This seems like a slam-dunk to me, but...
1) Is there some reason not to just add these two functions?
Not that I can see. Make sure to add an "Availability: Unix" marker in the relevant docs, though.
2) Are there any supported platforms with mmap and without shm_open/unlink?
The safest option is probably to add a configure check so we only expose these APIs when the underlying platform offers them. There's a *ton* of examples of such checks to copy from :)
3) Is this simple enough that a PEP isn't needed, just a patch in an issue?
Just a tracker issue will be fine - we expose additional posix APIs all the time without a PEP. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, 15 Feb 2012 10:07:23 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
I thought Windows was a Posix system? As such, it should have shm_open and shm_unlink, so the marker wouldn't be appropriate. Thanks, <mike -- Mike Meyer <mwm@mired.org> http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

On Wed, Feb 15, 2012 at 12:25 PM, Mike Meyer <mwm@mired.org> wrote:
Not as far as I am aware - if it was, Cygwin wouldn't be needed as a compatibility layer to get POSIX software running. To get them to work properly on Windows, many modules that interface with the OS have to use the win32 API directly rather than relying on the native implementations of the POSIX APIs.
As such, it should have shm_open and shm_unlink, so the marker wouldn't be appropriate.
In this case, it sounds like Windows may already have a roughly equivalent mechanism in mmap, so cross-platform support may be feasible. If that's the case, a marker won't be needed. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, 15 Feb 2012 13:07:54 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
The "tagname" feature in the Windows version uses ref counting to free the shared segment when no one is using it. shm_open requires someone to call shm_unlink, but doesn't actually remove it until there are no more references to it. However, you can't shm_open it again after shm_unlink'ing (expected on Unix, and verified on my FBSD box). We could sorta-kinda emulate the Windows "tagname" behavior using shm_open. I'd prefer to provide shm_open on Windows if at all possible. The "sorta-kinda" bothers me. That would also allow for an application to exit and then resume work stored in a mapped segment (something I've done before). However, setting this up on Windows isn't something I can do. Thanks, <mike -- Mike Meyer <mwm@mired.org> http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org
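The unlink semantics described above can be sketched in Python. This is only an illustration, not the API proposed in this thread: it assumes Python 3.8 or later, where multiprocessing.shared_memory.SharedMemory wraps shm_open/shm_unlink on POSIX systems, and the helper name unlink_semantics is hypothetical. On Windows, unlink() is a no-op, so the result would differ there.

```python
from multiprocessing import shared_memory


def unlink_semantics(size=4096):
    """Return (still_usable, reopened) for a segment after it is unlinked."""
    # Create a named segment; the name is generated by the library.
    seg = shared_memory.SharedMemory(create=True, size=size)
    name = seg.name
    try:
        # Remove the name. The existing mapping stays valid until closed.
        seg.unlink()
        seg.buf[:2] = b"ok"
        still_usable = bytes(seg.buf[:2]) == b"ok"

        # Attaching by name after the unlink is expected to fail on POSIX.
        try:
            shared_memory.SharedMemory(name=name).close()
            reopened = True
        except FileNotFoundError:
            reopened = False
    finally:
        seg.close()
    return still_usable, reopened
```

The segment is unlinked before it is used so that the sketch cannot leak a name if something fails later; a real application that wants to resume work after a restart would skip the unlink and keep the name.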

On Wed, Feb 15, 2012 at 2:10 PM, Mike Meyer <mwm@mired.org> wrote:
That's the purpose of the "Availability" markers in the docs - to allow a POSIX implementation to be added directly, then, if it's confirmed to work on Windows, or someone implements the necessary additional parts to make it work, the Availability restriction can be dropped. The OS interface on Windows is just too different for us to gate all OS service additions on having a working Windows version of the feature. (It's not *ideal* when that happens, of course, but it's a practical concession to the fact that our pool of Windows developers is significantly smaller than our pool of *nix and OS X developers). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 15/02/2012 4:10am, Mike Meyer wrote:
Maybe creating a file using CreateFile and FILE_ATTRIBUTE_TEMPORARY would have a similar effect - it hints to the system to avoid flushing to the disk. (os.open and O_TEMPORARY would not work because that also causes the file to be removed when all handles are closed.) sbt

Do the people here want to shift over to the concurrency mailing list? Would be nicer in there with a few more people.

On Wed, 15 Feb 2012 13:25:14 +0000 shibturn <shibturn@gmail.com> wrote:
Can you elaborate? I would think the general use case is to keep an mmap alive as long as you need it, so I don't understand why someone would destroy an mmap just after sending it to another process. Regards Antoine.

On 15/02/2012 5:02pm, Antoine Pitrou wrote:
A process which creates an mmap may want to transfer ownership of the mmap to another process along a pipeline. For example:

1) Process A creates an mmap
2) Process A does some work on mmap
3) Process A puts mmap on a queue.
4) mmap gets garbage collected in process A.
5) Process B gets mmap from queue.
...

With refcounting the mmap will be destroyed at step 4. With shm_open/shm_unlink, it would be Process B's responsibility to unlink the file. This is the scenario which Sturla Molden was concerned with, although he hadn't thought through the premature disposal issue.

sbt

P.S. I have posted a possible implementation of shm_open/shm_unlink for Windows at http://mail.python.org/pipermail/concurrency-sig/2012-February/000058.html
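The pipeline steps above can be sketched as follows. This is a minimal illustration rather than anything posted in the thread: it assumes Python 3.8+ on a POSIX system, uses multiprocessing.shared_memory (which is built on shm_open/shm_unlink there) and the "fork" start method, and the names consumer and run_pipeline are hypothetical. Only the segment name travels between the processes.

```python
import multiprocessing as mp
from multiprocessing import shared_memory


def consumer(conn):
    """Process B: attach by name, read the data, then unlink the segment."""
    name = conn.recv()
    seg = shared_memory.SharedMemory(name=name)
    try:
        conn.send(bytes(seg.buf[:4]))
    finally:
        seg.close()
        seg.unlink()  # B owns the segment now, so B removes it


def run_pipeline():
    """Process A: create, fill, and drop the mapping before B picks it up."""
    ctx = mp.get_context("fork")  # assumption: a POSIX platform with fork
    parent_conn, child_conn = ctx.Pipe()

    seg = shared_memory.SharedMemory(create=True, size=4096)
    seg.buf[:4] = b"work"
    name = seg.name
    seg.close()  # step 4: A's mapping is gone, but the named segment persists

    worker = ctx.Process(target=consumer, args=(child_conn,))
    worker.start()
    parent_conn.send(name)
    result = parent_conn.recv()
    worker.join()
    return result
```

Because nothing is reference counted here, closing A's mapping early is harmless; the cost is that the segment leaks if B never runs its unlink.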

P.S. I have posted a possible implementation of shm_open/shm_unlink for Windows at
http://mail.python.org/pipermail/concurrency-sig/2012-February/000058.html
A temporary file is not backed by shared memory on Windows, but is a persistent file on disk. You have to mmap from the OS' paging file to get shared memory. Sturla

On 16.02.2012 02:40, Sturla Molden wrote:
Hmm... It seems files created with the flag FILE_ATTRIBUTE_TEMPORARY are backed by memory if possible. Though MSDN does not say if it is shared memory that can be used for IPC. A blog article on MSDN from 2004 indicates that the combination FILE_ATTRIBUTE_TEMPORARY|FILE_FLAG_DELETE_ON_CLOSE is needed. The Windows systems programming book from MS Press does not mention FILE_ATTRIBUTE_TEMPORARY for temporary files. So it seems most Windows programmers are actually creating permanent files in the temp file folder, rather than creating temporary files. So the cause of the build-up of temporary files on Windows is actually a widespread programming error, not the fault of the operating system. It seems tempfile.NamedTemporaryFile will use FILE_ATTRIBUTE_TEMPORARY on Windows if called with delete=True. But it does not use FILE_FLAG_DELETE_ON_CLOSE as well, which probably is an error (particularly if the "delete" keyword argument should make sense). Sturla

On 16/02/2012 1:40am, Sturla Molden wrote:
An mmap can certainly be used as shared memory when it is backed by a real file. Or are you saying that it would work but be much slower? Also, according to this msdn blog http://blogs.msdn.com/b/larryosterman/archive/2004/04/19/116084.aspx if you open a file using FILE_ATTRIBUTE_TEMPORARY and FILE_FLAG_DELETE_ON_CLOSE the file will not be flushed to the disk unless there is memory pressure. sbt

On 16.02.2012 17:51, shibturn wrote:
An mmap can certainly be used as shared memory when it is backed by a real file. Or are you saying that it would work but be much slower?
For FILE_ATTRIBUTE_TEMPORARY, I am not sure if the memory is shared or private. (I.e. if using it for IPC will involve disk access.) mmap can certainly be used for shared memory. Sturla

On 16/02/2012 8:40pm, Sturla Molden wrote:
For FILE_ATTRIBUTE_TEMPORARY, I am not sure if the memory is shared or private. (I.e. if using it for IPC will involve disk access.)
Even if it is backed by a perfectly normal file, using an mmap for IPC does not require disk access if the relevant pages have not been evicted from memory. FILE_ATTRIBUTE_TEMPORARY only affects how eager the system is to flush modified data to the disk. sbt

On 16/02/12 06:43, shibturn wrote:
I don't know about Windows, but in Unix it's possible to send a file descriptor from one process to another over a unix-domain socket connection. So a refcounted anonymous mmap handover could be achieved this way:

1. Process A creates a temp file, mmaps it and unlinks it.
2. Process A sends the file descriptor to process B over a unix-domain socket.
3. Process B mmaps it.

Even if process A closes its version of the fd right after sending it, the OS should keep it alive while it's in transit, I think.

-- Greg
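The three steps above can be sketched in a single process, with the two ends of a socket pair standing in for processes A and B. This is only an illustration under stated assumptions: it relies on socket.send_fds and socket.recv_fds, which exist on Unix in Python 3.9 and later, and the function name hand_over_mapping is hypothetical.

```python
import mmap
import os
import socket
import tempfile


def hand_over_mapping(payload=b"hello from A", size=4096):
    """Pass an unlinked, mmapped temp file across a unix-domain socket."""
    a_sock, b_sock = socket.socketpair()  # AF_UNIX pair on Unix
    with a_sock, b_sock:
        # Step 1: "process A" creates a temp file, mmaps it and unlinks it.
        fd, path = tempfile.mkstemp()
        os.ftruncate(fd, size)
        a_map = mmap.mmap(fd, size)  # shared, read/write mapping by default
        os.unlink(path)
        a_map[:len(payload)] = payload

        # Step 2: A sends the descriptor, then drops everything it holds.
        socket.send_fds(a_sock, [b"fd"], [fd])
        a_map.close()
        os.close(fd)

        # Step 3: "process B" receives its own descriptor and mmaps it.
        _msg, fds, _flags, _addr = socket.recv_fds(b_sock, 16, 1)
        b_map = mmap.mmap(fds[0], size)
        try:
            return bytes(b_map[:len(payload)])
        finally:
            b_map.close()
            os.close(fds[0])
```

A closes its mapping and descriptor before B receives anything, which exercises the in-transit case Greg describes: the file has no name and no open descriptor in A, yet the data is still there when B maps it.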

On 16/02/2012 1:56am, Greg Ewing wrote:
If the receiving process is expecting an fd then that certainly works. But making it work transparently with pickle is difficult. (multiprocessing.reduction tried making it transparent using a background thread to accept requests for fds from unpickling processes. But that functionality has been disabled.) On Windows one rather cleaner possibility is for the process pickling the handle to use DuplicateHandle() to copy the handle to the main process. Then the receiving process can copy the handle from the main process, removing it from the main process at the same time by using "dwOptions=DUPLICATE_CLOSE_SOURCE". Since the main process will not exit before its descendants, that will solve the keep-alive problem. (I have managed to produce a working example of this scheme for transferring a file handle.) sbt

shibturn wrote:
If the receiving process is expecting an fd then that certainly works. But making it work transparently with pickle is difficult.
Is making it work with pickle a requirement? The point of using shared memory is to avoid the need for serialising and deserialising. -- Greg

participants (7)
- Antoine Pitrou
- Christopher Reay
- Greg Ewing
- Mike Meyer
- Nick Coghlan
- shibturn
- Sturla Molden