Mailman 3 multiprocessing IPC - Python-ideas

multiprocessing IPC

Sturla Molden

Feb. 10, 2012

11:36 p.m.

On 10.02.2012 22:15, Mike Meyer wrote:

...

In what way does the mmap module fail to provide your binary file interface? <mike

The short answer is that BSD mmap creates an anonymous kernel object. When working with multiprocessing for a while, one comes to the conclusion that we really need named kernel objects.

Here are two simple fail cases for anonymous kernel objects:

- Process A spawns/forks process B.
- Process B creates an object, one of the attributes is a lock.
- Fail: This object cannot be communicated back to process A. B inherits from A, A does not inherit from B.

- Process A spawns/forks a process pool.
- Process A creates an object, one of the attributes is a lock.
- Fail: This object cannot be communicated to the pool. They do not inherit new handles from A after they are started.

All of multiprocessing's IPC classes suffer from this!

Solution:

Use named kernel objects for IPC, pickle the name.

I made a shared memory array for NumPy that works like this -- implemented by memory mapping from the paging file on Windows, System V IPC on Linux. Underneath is an extension class that allocates a shared memory buffer. When pickled it encodes the kernel name, not its content, and unpickling opens the object given its name.

There is another drawback too:

The speed of pickle. For example, sharing NumPy arrays with pickle is not faster with shared memory. The overhead from pickle completely dominates the time needed for IPC. That is why I want a type specialized or a binary channel. Making this from the named shared memory class I already have is a no-brainer.

So that is my other objection against multiprocessing.

That is:

1. Object sharing by handle inheritance fails when kernel objects must be passed back to the parent process or to a process pool. We need IPC objects that have a name in the kernel, so they can be created and shared in retrospect.

2. IPC with multiprocessing is too slow due to pickle. We need something that does not use pickle. (E.g. shared memory, but not by means of mmap.) It might be that the pipe or socket in multiprocessing will do this (I have not looked at it carefully enough), but they still don't have

Proof of concept:

http://dl.dropbox.com/u/12464039/sharedmem-feb12-2009.zip

Dependency on Cython and NumPy should probably be removed, never mind that. Important part is this:

sharedmemory_sysv.pyx (Linux)
sharedmemory_win.pyx and ntqueryobject.c (Windows)

Finally, I'd like to say that I think Python's standard lib should support high-performance asynchronous I/O for concurrency. That is not poll/select (on Windows it does not even work properly). Rather, I want IOCP on Windows, epoll on Linux, and kqueue on Mac. (Yes I know about twisted.) There should also be a requirement that it works with multiprocessing. E.g. if we open a process pool, the processes should be able to use the same IOCP. In other words some highly scalable asynchronous I/O that works with multiprocessing.

So ... As far as I am concerned, the only thing worth keeping in multiprocessing is multiprocessing.Process and multiprocessing.Pool. The rest doesn't do what we want.

Sturla
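For readers who want to try the "named kernel object" idea described above, here is a minimal sketch. It assumes Python 3.8 or later, where the standard library gained multiprocessing.shared_memory (this postdates the thread and is not the Cython proof of concept linked above). The function name share_by_name is made up for illustration. Both handles live in one process here only to keep the sketch short; the name is an ordinary string, so it could equally be sent to an unrelated process.

```python
from multiprocessing import shared_memory


def share_by_name():
    """Create a named shared memory block, then attach to it by name only."""
    # The creator asks the OS for a new named block (name chosen automatically).
    block = shared_memory.SharedMemory(create=True, size=16)
    try:
        block.buf[:5] = b"hello"
        name = block.name  # plain string identifying the kernel object

        # A second handle is opened from the name alone, with no inheritance.
        peer = shared_memory.SharedMemory(name=name)
        try:
            return bytes(peer.buf[:5])
        finally:
            peer.close()
    finally:
        block.close()
        block.unlink()  # the creator is responsible for removing the block
```

Only the short name has to travel between processes, which is the property asked for in the message; cleanup remains the creator's job.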


Jesse Noller

February 2012

12:04 a.m.

On Feb 10, 2012, at 6:36 PM, Sturla Molden <sturla@molden.no> wrote:

...

Sturla, I think I've talked to you before - patches to improve multiprocessing from you are definitely welcome, and needed. I disagree with tossing as much out as you are suggesting - managers are pretty useful, for example, but the entire team and especially me would welcome patches to improve things. Jesse

Sturla Molden

3:10 p.m.

On 11.02.2012 00:36, Sturla Molden wrote:

...

Proof of concept:

http://dl.dropbox.com/u/12464039/sharedmem-feb12-2009.zip

Sorry, wrong version. Use this instead: http://dl.dropbox.com/u/12464039/sharedmem.zip Sturla

Antoine Pitrou

3:27 p.m.

On Sat, 11 Feb 2012 00:36:15 +0100 Sturla Molden <sturla@molden.no> wrote:

...

This is not trivial (especially the IOCP part, if I consider the amount of code Twisted has for that).

...

Ouch. Regards Antoine.
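As a rough illustration of the readiness-based I/O being discussed, the sketch below uses the selectors module (added to the standard library in Python 3.4, after this thread). selectors.DefaultSelector picks epoll on Linux and kqueue on BSD/macOS; it does not provide IOCP, so this covers only part of what the original message asks for. The function name wait_for_one_message is hypothetical.

```python
import selectors
import socket


def wait_for_one_message():
    """Register one socket for readability and wait until data arrives."""
    sel = selectors.DefaultSelector()  # best available mechanism on this OS
    left, right = socket.socketpair()
    try:
        right.setblocking(False)
        sel.register(right, selectors.EVENT_READ)
        left.sendall(b"ping")
        # select() returns (key, event_mask) pairs for the ready objects.
        events = sel.select(timeout=5)
        data = b"".join(key.fileobj.recv(16) for key, _mask in events)
        return type(sel).__name__, data
    finally:
        sel.close()
        left.close()
        right.close()
```

The returned class name shows which mechanism was selected on the current platform.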

Mike Meyer

1:52 a.m.

On Sat, 11 Feb 2012 00:36:15 +0100 Sturla Molden <sturla@molden.no> wrote:

...

On 10.02.2012 22:15, Mike Meyer wrote:

...
In what way does the mmap module fail to provide your binary file interface? <mike

The short answer is that BSD mmap creates an anonymous kernel object.

First, I didn't ask about "BSD mmap", I asked about the "mmap module". They aren't the same thing.

...

When working with multiprocessing for a while, one comes to the conclusion that we really need named kernel objects.

And both the BSD mmap (at least in recent systems) and the mmap module provide objects with names in the file system space. IIUC, while there are systems that won't let you create anonymous objects (like early versions of the mmap module), there aren't any - at least any longer - that won't let you create named objects.

...

Here are two simple fail cases for anonymous kernel objects:

[elided, since the restriction doesn't exist]

...

All of multiprocessing's IPC classes suffer from this!

Some of them may. The one I asked about doesn't.

...

Solution:

Use named kernel objects for IPC, pickle the name.

You don't need to pickle the name if you use mmap's native name system - it's just a string.

...

There is another drawback too:

The speed of pickle. For example, sharing NumPy arrays with pickle is not faster with shared memory. The overhead from pickle completely dominate the time needed for IPC . That is why I want a type specialized or a binary channel. Making this from the named shared memory class I already have is a no-brainer.

...

So that is my other objection against multiprocessing.

1. Object sharing by handle inheritance fails when kernel objects must be passed back to the parent process or to a process pool. We need IPC objects that have a name in the kernel, so they can be created and shared in retrospect.

We've already got that one. You just need to learn how to use it.

...

2. IPC with multiprocessing is too slow due to pickle. We need something that does not use pickle. (E.g. shared memory, but not by means of mmap.) It might be that the pipe or socket in multiprocessing will do this (I have not looked at it carefully enough), but they still don't have

Since you can use pickle, you're only dealing with small amounts of data. There are better performing serialization tools available (or they can easily be created if you have to deal with large amounts of data), and those work fine for a large variety of problems. If they aren't fast enough, neither a socket nor a pipe will solve the basic issue of needing to serialize the data in order to communicate it. This isn't a problem with mmap per se, and it's not a problem that anything that can be accurately described as a "file" - as in your "binary file interface" - is going to solve.

     <mike
-- 
Mike Meyer <mwm@mired.org>		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Sturla Molden

3:46 a.m.

On 12.02.2012 02:52, Mike Meyer wrote:
> First, I didn't ask about "BSD mmap", I asked about the "mmap module". 
> They aren't the same thing. 

Take a look at the implementation.

>> When working with multiprocessing for a while, one comes to the
>> conclusion that we really need named kernel objects.
> And both the BSD mmap (at least in recent systems) and the mmap module
> provide objects with names in the file system space. IIUC, while there
> are systems that won't let you create anonymous objects (like early
> versions of the mmap module), there aren't any - at least any longer -
> that won't let you create named objects.

Sure, you can memory map named files. You can even memory map from 
/dev/shm on a system that supports it, if you are willing to reserve 
some RAM for ramdisk.

But apart from that, show me how you would use the mmap module to make 
named shared memory on Linux or Windows. No, memory mapping file object 
-1 or 0 don't count, you get an anonymous memory mapping.

Here is a task for you to try:

1. start a process
2. in the new process, create some shared memory (use the mmap module)
3. make the parent process get access to it (should be easy, right?)

Can you do this? No?

Then try the same thing with a lock (multiprocessing.Lock) or an event.

Show me how you would code this.

>
> >  Use named kernel objects for IPC, pickle the name.
> You don't need to pickle the name if you use mmap's native name system
> - it's just a string.

Sure, multiprocessing does not pickle strings objects. Or whatever. Have 
you ever looked at the code?



> Since you can use pickle, you're only dealing with small amounts of
> data.

What on earth are you talking about?

Every object passed in the "args" keyword argument to 
multiprocessing.Process is pickled. Same thing for any object you pass 
to multiprocessing.Queue.

Look at the code.



Sturla

Mike Meyer

8:02 a.m.

On Sun, 12 Feb 2012 04:46:00 +0100
Sturla Molden <sturla@molden.no> wrote:

> On 12.02.2012 02:52, Mike Meyer wrote:
> > First, I didn't ask about "BSD mmap", I asked about the "mmap module". 
> > They aren't the same thing. 
> Take a look at the implementation.

True, but we're talking about an API, not a specific implementation.

> >> When working with multiprocessing for a while, one comes to the
> >> conclusion that we really need named kernel objects.
> > And both the BSD mmap (at least in recent systems) and the mmap module
> > provide objects with names in the file system space. IIUC, while there
> > are systems that won't let you create anonymous objects (like early
> > versions of the mmap module), there aren't any - at least any longer -
> > that won't let you create named objects.
> Sure, you can memory map named files. You can even memory map from 
> /dev/shm on a system that supports it, if you are willing to reserve 
> some RAM for ramdisk.

And that's *not* the anonymous kernel object you complained about
getting from mmap.

> But apart from that, show me how you would use the mmap module to make 
> named shared memory on Linux or Windows. No, memory mapping file object 
> -1 or 0 don't count, you get an anonymous memory mapping.

The linux mmap has the same arguments as the BSD one, so I'd expect it
to work the same. I expect that the Python core will have made the
semantics work properly on Windows, but don't really care, and don't
have a Windows system to test it on. And that's why I'm talking about
the API, not the implementation.

> Here is a task for you to try:
> 
> 1. start a process
> 2. in the new process, create some shared memory (use the mmap module)
> 3. make the parent process get access to it (should be easy, right?)
> Can you do this? No?

Works exactly like I'd expect it to.

> Show me how you would code this.

Here's the code that creates the shared file:

    from mmap import mmap
    from multiprocessing import Process
    from time import sleep

    share_name = '/tmp/xyzzy'
    with open(share_name, 'wb') as f:
        f.write(b'hello')

Here's the code for the child:

    def proc():
        # Map the named file and overwrite its first five bytes.
        with open(share_name, 'r+b') as f:
            share = mmap(f.fileno(), 0)
            share[:5] = b'gone\n'

Here's the code for the parent:

    child = Process(target=proc)
    child.start()
    with open(share_name, mode='r+b') as f:
        share = mmap(f.fileno(), 0)
        # Poll until the child has replaced the leading 'h'.
        while share[0] == ord('h'):
            sleep(1)
        print('main:', share.readline())

> > >  Use named kernel objects for IPC, pickle the name.
> > You don't need to pickle the name if you use mmap's native name system
> > - it's just a string.
> Sure, multiprocessing does not pickle strings objects. Or whatever. Have 
> you ever looked at the code?

I didn't say multiprocessing wouldn't pickle the name, *or* anything
else about the multiprocessing module. I said *you* didn't need to
pickle it. And I didn't. Did you read what I wrote?
  
> Every object passed in the "args" keyword argument to 
> multiprocessing.Process is pickled. Same thing for any object you pass 
> to multiprocessing.Queue.

Yes, but we're not talking about multiprocessing.Queue. We're talking
about mmap. multiprocessing.Queue doesn't use mmap. For that, you want
to use multiprocessing.Value and multiprocessing.Array.

> Look at the code.

Look at the text.

     <mike
-- 
Mike Meyer <mwm@mired.org>		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Sturla Molden

2:15 p.m.

On 12.02.2012 09:02, Mike Meyer wrote:

...

True, but we're talking about an API, not a specific implementation.

You have been complaining about the GIL which is a specific implementation. I am talking about how multiprocessing actually works, i.e. implementation.

...

...
But apart from that, show me how you would use the mmap module to make named shared memory on Linux or Windows. No, memory mapping file object -1 or 0 don't count, you get an anonymous memory mapping.

The linux mmap has the same arguments as the BSD one, so I'd expect it to work the same.

It calls BSD mmap in the implementation on Linux. It calls CreateFileMapping and MapViewOfFile on Windows.

...

Works exactly like I'd expect it to.

...
Show me how you would code this.

Here's the code that creates the shared file:

    share_name = '/tmp/xyzzy'
    with open(share_name, 'wb') as f:
        f.write(b'hello')

Here's the code for the child:

    with open(share_name, 'r+b') as f:
        share = mmap(f.fileno(), 0)
        share[:5] = b'gone\n'

Here's the code for the parent:

    child = Process(target=proc)
    child.start()
    with open(share_name, mode='r+b') as f:
        share = mmap(f.fileno(), 0)
        while share[0] == ord('h'):
            sleep(1)
        print('main:', share.readline())

Here you are memory mapping a temporary file, not shared memory. On Linux, shared memory with mmap does not have a share_name. It has fileno -1. So go ahead and replace f.fileno() with -1 and see if it still works for you.

This is how mmap is used for shared memory on Linux:

    shm = mmap.mmap(-1, 4096)
    os.fork()

See how the fork comes after the mmap. Which means it must always be allocated in the parent process. That is why we need an implementation with System V IPC instead of mmap.
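The mmap-then-fork ordering described above can be tried with the following sketch. It is Unix-only (it relies on os.fork), and the function name anonymous_share_demo is invented for illustration. The mapping has no name; the child can reach it only because it was created before the fork and is therefore inherited.

```python
import mmap
import os


def anonymous_share_demo():
    """Share an anonymous mapping with a forked child (Unix only)."""
    # fileno -1 requests an anonymous mapping; on Unix it is shared by default.
    shm = mmap.mmap(-1, 4096)
    pid = os.fork()
    if pid == 0:
        # Child: write into the inherited mapping and exit immediately.
        shm[:5] = b"child"
        os._exit(0)
    # Parent: wait for the child, then read what it wrote.
    os.waitpid(pid, 0)
    result = bytes(shm[:5])
    shm.close()
    return result
```

Had the child created the mapping after the fork instead, the parent would have no handle and no name with which to open it, which is the limitation the message is pointing at.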

...

Yes, but we're not talking about multiprocessing.Queue. We're talking about mmap. multiprocessing.Queue doesn't use mmap. For that, you want to us multiprocessing.Value and multiprocessing.Array.

Pass multiprocessing.Value or multiprocessing.Array to multiprocessing.Queue and see what happens. And while you are at it, pass multiprocessing.Lock to multiprocessing.Queue and see what happens as well.

Contemplate how we can pass an object with a lock as a message between two processes. Should we change the implementation?

And then, look up the implementation for multiprocessing.Value and Array and see if (and how) they use mmap. Perhaps you just told me to use mmap instead of mmap.

Sturla
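The experiment suggested above can be approximated without a second process. The sketch below pickles the objects directly, which is the step a Queue performs before sending (in CPython the Queue does this in a background feeder thread, so the error would surface there rather than in the caller). The helper name try_pickle is made up for illustration, and the behaviour shown is that of CPython's multiprocessing as I understand it.

```python
import multiprocessing as mp
import pickle


def try_pickle(obj):
    """Return the error text if obj refuses to be pickled, else None."""
    try:
        pickle.dumps(obj)
    except RuntimeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # Locks and synchronized values refuse pickling outside process start-up.
    print("Lock: ", try_pickle(mp.Lock()))
    print("Value:", try_pickle(mp.Value("i", 0)))
    # A plain string, such as a kernel object name, pickles without complaint.
    print("name: ", try_pickle("some-object-name"))
```

The contrast between the last line and the first two is the point being made in the thread: a name can be sent as a message, while a handle obtained by inheritance cannot.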

Mike Meyer

10:14 p.m.

On Sun, 12 Feb 2012 15:15:46 +0100 Sturla Molden <sturla@molden.no> wrote:

...

No, I haven't. To me, the GIL is one of the minor reasons to avoid using threads in Python. I doubt that I've mentioned it at all.

Given how much attention you pay to details, I no longer care about getting an answer to my question, as I suspect that it will have as much accuracy as that statement.

     <mike
-- 
Mike Meyer <mwm@mired.org>		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Sturla Molden

10:20 p.m.

On 12.02.2012 23:14, Mike Meyer wrote:

...

My apologies, I was confusing you with Matt Joiner. Sturla

shibturn

1:52 p.m.

On 12/02/2012 3:46am, Sturla Molden wrote:

...

As Mike says, on Unix you can just create a file in /tmp to back an mmap. On Linux, posix mmaps created with shm_open() seem to be normal files on a tmpfs file system, usually /dev/shm. Since /tmp is also usually a tmpfs file system on Linux, I assume this would be equivalent in terms of overhead.

On Windows you can use the tagname argument of mmap.mmap().

Maybe a BinaryBlob wrapper class could be created which lets an mmap be "pickled by reference". Managing life time and reliable cleanup might be awkward though.

If the pickle overhead is the problem you could try Connection.send_bytes() and Connection.recv_bytes(). I suppose Queue objects could grow put_bytes() and get_bytes() methods too. Or a BytesQueue class could be created.
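A small sketch of the Connection.send_bytes() / Connection.recv_bytes() suggestion follows. It moves a buffer of doubles through a multiprocessing Pipe as raw bytes, so no pickling of the payload is involved. Both ends are kept in one process purely to keep the example short, and the function name roundtrip_raw is hypothetical.

```python
from array import array
from multiprocessing import Pipe


def roundtrip_raw(values):
    """Send a list of floats through a Pipe as raw bytes and read it back."""
    receiver, sender = Pipe()
    try:
        payload = array("d", values)
        # send_bytes() transmits the buffer as-is, bypassing pickle.
        sender.send_bytes(payload.tobytes())
        received = array("d")
        received.frombytes(receiver.recv_bytes())
        return received.tolist()
    finally:
        receiver.close()
        sender.close()
```

The receiver has to know the element type in advance ("d" here), which is the price of skipping pickle's self-describing format.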

...

Can you do this? No?

Then try the same thing with a lock (multiprocessing.Lock) or an event.

I have a patch (http://bugs.python.org/issue8713) to make multiprocessing on Unix work with fork+exec which has to do this because semaphores cannot be inherited across exec. Making sure all the named semaphores get removed if the program terminates abnormally is a bit awkward though. It could be modified to make them picklable in general.

On Windows dealing with "named objects" is easier since they are refcounted by the operating system and deleted when no more processes have handles for them.

If you make a feature request at bugs.python.org I might work on a patch.

Cheers
sbt

Sturla Molden

2:35 p.m.

On 12.02.2012 14:52, shibturn wrote:

...

Mark did not use shm_open, he memory mapped from disk.

...

Cleaning up SysV ipc semaphores and shared memory is similar (semctl instead of shmctl to get reference count). And then we need a monkey patch for os._exit.

Look at the Cython code here: http://dl.dropbox.com/u/12464039/sharedmem.zip

Sturla

shibturn

3:20 p.m.

On 12/02/2012 2:35pm, Sturla Molden wrote:

...

Mark did not use shm_open, he memory mapped from disk.

But if his /tmp is a tmpfs file system (which it usually is on Linux) then I think it is entirely equivalent. Or he could create the file in /dev/shm instead.

Below is a Blob class which seems to work. Note that the process which created the blob needs to wait for the other process to unpickle it before allowing it to be garbage collected.

    import multiprocessing as mp
    from multiprocessing.util import Finalize, get_temp_dir
    import mmap, sys, os, itertools

    class Blob(object):

        _counter = itertools.count()

        def __init__(self, length, name=None):
            self.length = length
            if sys.platform == 'win32':
                # Windows: a tagname gives the mapping a kernel-level name.
                if name is None:
                    name = 'blob-%s-%d' % (os.getpid(), next(self._counter))
                self.name = name
                self.mmap = mmap.mmap(-1, length, self.name)
            else:
                # Unix: back the mapping with a named temporary file.
                if name is None:
                    self.name = '%s/blob-%s-%d' % (get_temp_dir(), os.getpid(),
                                                   next(self._counter))
                    flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
                else:
                    self.name = name
                    flags = os.O_RDWR
                fd = os.open(self.name, flags, 0o600)
                try:
                    if name is None:
                        # Only the creator sizes the file and removes it later.
                        os.ftruncate(fd, length)
                        Finalize(self, os.unlink, (self.name,), exitpriority=0)
                    self.mmap = mmap.mmap(fd, length)
                finally:
                    os.close(fd)

        def __reduce__(self):
            # Pickle by reference: only the length and the name are sent.
            return Blob, (self.length, self.name)

    def child(conn):
        b = Blob(20)
        b.mmap[:5] = b"hello"
        conn.send(b)
        conn.recv()                 # wait for acknowledgement before
                                    # allowing garbage collection

    if __name__ == '__main__':
        conn, child_conn = mp.Pipe()
        p = mp.Process(target=child, args=(child_conn,))
        p.start()
        b = conn.recv()
        conn.send(None)             # acknowledge receipt
        print(repr(b.mmap[:]))

Sturla Molden

8:33 p.m.

On 12.02.2012 16:20, shibturn wrote:

...

It seems that on Linux /tmp is backed by shared memory. Which sounds rather strange to a Windows user, as the raison d'etre for tempfiles is temporary storage space that goes beyond physical RAM.

I've also read that the use of ftruncate in this context can result in SIGBUS.

...

I would look at kernel refcounts before unlinking. (But I am not that familiar with Linux.) Sturla

shibturn

12:31 a.m.

On 12/02/2012 8:33pm, Sturla Molden wrote:

...

It seems that on Linux /tmp is backed by shared memory.

Which sounds rather strange to a Windows user, as the raison d'etre for tempfiles is temporary storage space that goes beyond physical RAM.

In reality /tmp is backed by swap space, so physical RAM does not impose a limit. Anonymous mmaps are also backed by swap space.

...

I've also read that the use of ftruncate in this context can result in SIGBUS.

Isn't that if you truncate the file to a smaller size *after* it has been mapped? As far as I am aware, using ftruncate to set the length *before* it can be mapped for the first time is standard practice and harmless.

...

...
Below is Blob class which seems to work. Note that the process which created the blob needs to wait for the other process to unpickle it before allowing it to be garbage collected.

I would look at kernel refcounts before unlinking. (But I am not that familiar with Linux.)

Even if you have automatic refcounting like on Windows, you still need to cope with lifetime management issues. If you put an object on a queue it may be a long time before the target process will unpickle the object and increase its refcount, and you must not decref the object until it has, or else it will disappear.

I don't know how to get the ref count for a file descriptor on Unix. (And posix shared memory does not seem to get a refcount either, even though System V shared memory does.)

sbt

shibturn

12:42 a.m.

On 13/02/2012 12:31am, shibturn wrote:

...

Ah, on some Unixes ftruncate() limits the size of the file, but will not increase it. sbt
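The "size the file with ftruncate before mapping it" practice discussed above looks roughly like the sketch below. It assumes a platform where ftruncate() can extend a file (Linux does; as noted above, some Unixes do not), and the function name create_mapped_file is hypothetical.

```python
import mmap
import os
import tempfile


def create_mapped_file(length):
    """Create a temporary file of the given size and map it into memory."""
    fd, path = tempfile.mkstemp()
    try:
        # Set the length *before* mapping, so every mapped page is backed
        # by the file and touching it cannot raise SIGBUS.
        os.ftruncate(fd, length)
        mapping = mmap.mmap(fd, length)
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor is closed
    return path, mapping
```

The caller is responsible for closing the mapping and unlinking the path; shrinking the file while it is still mapped is the case to avoid.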

Mike Meyer

4:15 p.m.

On Sun, Feb 12, 2012 at 3:33 PM, Sturla Molden <sturla@molden.no> wrote:

...

That's what /tmp was created for on Unix as well. But we've since added virtual memory for that same purpose. Modern kernel virtual address spaces are bigger than disks, and the IO and VM subsystem buffer caches have similar performance, and may even share buffers. So the major difference between memory-backed and fs-backed /tmp is that an fs-backed one survives a reboot, which creates security issues on multiuser systems.

In theory, you could create a file on a memory-backed /tmp that's bigger than any data structure your process can hold. But modern software tends to use /tmp for things that need to be shared between processes (unix-domain sockets, lock files, etc), and legacy software is usually quite happy with a few tens of megabytes on /tmp. So it's rather common for a system's per-process virtual address limit to be bigger than /tmp.

     <mike

Jesse Noller

February 2012

12:04 a.m.

On Feb 10, 2012, at 6:36 PM, Sturla Molden <sturla@molden.no> wrote:

...

Den 10.02.2012 22:15, skrev Mike Meyer:

...
In what way does the mmap module fail to provide your binary file interface? <mike

The short answer is that BSD mmap creates an anonymous kernel object. When working with multiprocessing for a while, one comes to the conclusion that we really need named kernel objects.

Here are two simple fail cases for anonymous kernel objects:

- Process A spawns/forks process B. - Process B creates an object, one of the attributes is a lock. - Fail: This object cannot be communicated back to process A. B inherits from A, A does not inherit from B.

- Process A spawns/forks a process pool. - Process A creates an object, one of the attributes is a lock. - Fail: This object cannot be communicated to the pool. They do not inherit new handles from A after they are started.

All of multiprocessing's IPC classes suffer from this!

Solution:

Use named kernel objects for IPC, pickle the name.

I made a shared memory array for NumPy that workes like this -- implemented by memory mapping from the paging file on Windows, System V IPC on Linux. Underneath is an extension class that allocates a shared memory buffer. When pickled it encodes the kernel name, not its content, and unpickling opens the object given its name.

There is another drawback too:

The speed of pickle. For example, sharing NumPy arrays with pickle is not faster with shared memory. The overhead from pickle completely dominate the time needed for IPC . That is why I want a type specialized or a binary channel. Making this from the named shared memory class I already have is a no-brainer.

So that is my other objection against multiprocessing.

That is:

1. Object sharing by handle inheritance fails when kernel objects must be passed back to the parent process or to a process pool. We need IPC objects that have a name in the kernel, so they can be created and shared in retrospect.

2. IPC with multiprocessing is too slow due to pickle. We need something that does not use pickle. (E.g. shared memory, but not by means of mmap.) It might be that the pipe or socket in multiprocessing will do this (I have not looked at it carefully enough), but they still don't have

Proof of concept:

http://dl.dropbox.com/u/12464039/sharedmem-feb12-2009.zip

Dependency on Cython and NumPy should probably be removed, never mind that. Important part is this:

sharedmemory_sysv.pyx (Linux) sharedmemory_win.pyx and ntqueryobject.c (Windows)

Finally, I'd like to say that I think Python's standard lib should support high-performance asynchronous I/O for concurrency. That is not poll/select (on Windows it does not even work properly). Rather, I want IOCP on Windows, epoll on Linux, and kqueue on Mac. (Yes I know about twisted.) There should also be a requirement that it works with multiprocessing. E.g. if we open a process pool, the processes should be able to use the same IOCP. In other words some highly scalable asynchronous I/O that works with multiprocessing.

So ... As far as I am concerned, the only thing worth keeping in multipricessing is multiprocessing.Process and multiprocessing.Pool. The rest doesn't do what we want.

Sturla

Sturla Molden

3:10 p.m.

Den 11.02.2012 00:36, skrev Sturla Molden:

...

Proof of concept:

http://dl.dropbox.com/u/12464039/sharedmem-feb12-2009.zip

Sorry, wrong version. Use this instead: http://dl.dropbox.com/u/12464039/sharedmem.zip Sturla

Antoine Pitrou

3:27 p.m.

On Sat, 11 Feb 2012 00:36:15 +0100 Sturla Molden <sturla@molden.no> wrote:

...

This is not trivial (especially the IOCP part, if I consider the amount of code Twisted has for that).

...

Ouch. Regards Antoine.

Mike Meyer

1:52 a.m.

pwdOn Sat, 11 Feb 2012 00:36:15 +0100 Sturla Molden <sturla@molden.no> wrote:

...

Den 10.02.2012 22:15, skrev Mike Meyer:

...
In what way does the mmap module fail to provide your binary file interface? <mike The short answer is that BSD mmap creates an anonymous kernel object.

First, I didn't ask about "BSD mmap", I asked about the "mmap module". They aren't the same thing.

...

When working with multiprocessing for a while, one comes to the conclusion that we really need named kernel objects.

...

Here are two simple fail cases for anonymous kernel objects:

[elided, since the restriction doesn't exist]

...

All of multiprocessing's IPC classes suffer from this!

Some of them may. The one I asked about doesn't.

...

Solution:

Use named kernel objects for IPC, pickle the name.

You don't need to pickle the name if you use mmap's native name system - it's just a string.

...

There is another drawback too:

The speed of pickle. For example, sharing NumPy arrays with pickle is not faster with shared memory. The overhead from pickle completely dominate the time needed for IPC . That is why I want a type specialized or a binary channel. Making this from the named shared memory class I already have is a no-brainer.

...

So that is my other objection against multiprocessing.

1. Object sharing by handle inheritance fails when kernel objects must be passed back to the parent process or to a process pool. We need IPC objects that have a name in the kernel, so they can be created and shared in retrospect.

We've already got that one. You just need to learn how to use it.

...

2. IPC with multiprocessing is too slow due to pickle. We need something that does not use pickle. (E.g. shared memory, but not by means of mmap.) It might be that the pipe or socket in multiprocessing will do this (I have not looked at it carefully enough), but they still don't have

Sturla Molden

3:46 a.m.

Den 12.02.2012 02:52, skrev Mike Meyer:
> First, I didn't ask about "BSD mmap", I asked about the "mmap module". 
> They aren't the same thing. 

Take a look at the implementation.

>> When working with multiprocessing for a while, one comes to the
>> conclusion that we really need named kernel objects.
> And both the BSD mmap (at least in recent systems) and the mmap module
> provide objects with names in the file system space. IIUC, while there
> are systems that won't let you create anonymous objects (like early
> versions of the mmap module), there aren't any - at least any longer -
> that won't let you create named objects.

Sure, you can memory map named files. You can even memory map from 
/dev/shm on a system that supports it, if you are willing to reserve 
some RAM for ramdisk.
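A sketch of the file-backed variant, assuming a POSIX system. It uses /dev/shm when that directory exists and is writable, and otherwise falls back to the ordinary temp directory, so the file is not guaranteed to be RAM-backed. The second mapping is opened knowing only the path, which is the "name" in this scheme.

```python
import mmap
import os
import tempfile

# Prefer the RAM-backed directory if the system provides one.
base = "/dev/shm"
if not (os.path.isdir(base) and os.access(base, os.W_OK)):
    base = tempfile.gettempdir()

fd, path = tempfile.mkstemp(prefix="demo_", dir=base)
try:
    # Give the file a size before mapping it.
    os.ftruncate(fd, 4096)
    with mmap.mmap(fd, 4096) as writer:
        writer[:5] = b"hello"
        # A second opener that knows only the path sees the same bytes.
        with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as reader:
            seen = bytes(reader[:5])
finally:
    os.close(fd)
    os.unlink(path)
```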

But apart from that, show me how you would use the mmap module to make 
named shared memory on Linux or Windows. No, memory mapping file object 
-1 or 0 don't count, you get an anonymous memory mapping.

Here is a task for you to try:

1. start a process
2. in the new process, create some shared memory (use the mmap module)
3. make the parent process get access to it (should be easy, right?)

Can you do this? No?

Then try the same thing with a lock (multiprocessing.Lock) or an event.

Show me how you would code this.
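The lock case can be checked without starting a second process. A sketch, assuming current CPython behavior: a multiprocessing.Lock refuses to be pickled outside of process start-up, so it cannot be sent through a Queue or Pipe after the fact.

```python
import multiprocessing
import pickle

lock = multiprocessing.Lock()

pickled = True
reason = ""
try:
    # Outside of process creation this is expected to be refused.
    pickle.dumps(lock)
except RuntimeError as exc:
    pickled = False
    reason = str(exc)

print("pickled:", pickled)
print("reason:", reason)
```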

>
> >  Use named kernel objects for IPC, pickle the name.
> You don't need to pickle the name if you use mmap's native name system
> - it's just a string.

Sure, multiprocessing does not pickle string objects. Or whatever. Have 
you ever looked at the code?



> Since you can use pickle, you're only dealing with small amounts of
> data.

What on earth are you talking about?

Every object passed in the "args" keyword argument to 
multiprocessing.Process is pickled. Same thing for any object you pass 
to multiprocessing.Queue.

Look at the code.



Sturla

Mike Meyer

8:02 a.m.

On Sun, 12 Feb 2012 04:46:00 +0100
Sturla Molden <sturla@molden.no> wrote:

> Den 12.02.2012 02:52, skrev Mike Meyer:
> > First, I didn't ask about "BSD mmap", I asked about the "mmap module". 
> > They aren't the same thing. 
> Take a look at the implementation.

True, but we're talking about an API, not a specific implementation.

> >> When working with multiprocessing for a while, one comes to the
> >> conclusion that we really need named kernel objects.
> > And both the BSD mmap (at least in recent systems) and the mmap module
> > provide objects with names in the file system space. IIUC, while there
> > are systems that won't let you create anonymous objects (like early
> > versions of the mmap module), there aren't any - at least any longer -
> > that won't let you create named objects.
> Sure, you can memory map named files. You can even memory map from 
> /dev/shm on a system that supports it, if you are willing to reserve 
> some RAM for ramdisk.

And that's *not* the anonymous kernel object you complained about
getting from mmap.

> But apart from that, show me how you would use the mmap module to make 
> named shared memory on Linux or Windows. No, memory mapping file object 
> -1 or 0 don't count, you get an anonymous memory mapping.

The linux mmap has the same arguments as the BSD one, so I'd expect it
to work the same. I expect that the Python core will have made the
semantics work properly on Windows, but don't really care, and don't
have a Windows system to test it on. And that's why I'm talking about
the API, not the implementation.

> Here is a task for you to try:
> 
> 1. start a process
> 2. in the new process, create some shared memory (use the mmap module)
> 3. make the parent process get access to it (should be easy, right?)
> Can you do this? No?

Works exactly like I'd expect it to.

> Show me how you would code this.

Here's the code that creates the shared file:

    from mmap import mmap
    from multiprocessing import Process
    from time import sleep

    share_name = '/tmp/xyzzy'
    with open(share_name, 'wb') as f:
        f.write(b'hello')

Here's the code for the child:

    def proc():
        # Map the named file and overwrite its first five bytes.
        with open(share_name, 'r+b') as f:
            share = mmap(f.fileno(), 0)
            share[:5] = b'gone\n'

Here's the code for the parent:

    child = Process(target=proc)
    child.start()
    with open(share_name, mode='r+b') as f:
        share = mmap(f.fileno(), 0)
        # Poll until the child's write becomes visible.
        while share[0] == ord('h'):
            sleep(1)
        print('main:', share.readline())

> > >  Use named kernel objects for IPC, pickle the name.
> > You don't need to pickle the name if you use mmap's native name system
> > - it's just a string.
> Sure, multiprocessing does not pickle string objects. Or whatever. Have 
> you ever looked at the code?

I didn't say multiprocessing wouldn't pickle the name, *or* anything
else about the multiprocessing module. I said *you* didn't need to
pickle it. And I didn't. Did you read what I wrote?
  
> Every object passed in the "args" keyword argument to 
> multiprocessing.Process is pickled. Same thing for any object you pass 
> to multiprocessing.Queue.

Yes, but we're not talking about multiprocessing.Queue. We're talking
about mmap. multiprocessing.Queue doesn't use mmap. For that, you want
to use multiprocessing.Value and multiprocessing.Array.

> Look at the code.

Look at the text.

     <mike
-- 
Mike Meyer <mwm@mired.org>		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Sturla Molden

February 2012

8:15 a.m.

Den 12.02.2012 09:02, skrev Mike Meyer:

...

True, but we're talking about an API, not a specific implementation.

You have been complaining about the GIL which is a specific implementation. I am talking about how multiprocessing actually works, i.e. implementation.

...

...
But apart from that, show me how you would use the mmap module to make named shared memory on Linux or Windows. No, memory mapping file object -1 or 0 don't count, you get an anonymous memory mapping.

The linux mmap has the same arguments as the BSD one, so I'd expect it to work the same.

It calls BSD mmap in the implementation on Linux. It calls CreateFileMapping and MapViewOfFile on Windows.

...

Works exactly like I'd expect it to.

...
Show me how you would code this.

Here's the code that creates the shared file:

    share_name = '/tmp/xyzzy'
    with open(share_name, 'wb') as f:
        f.write(b'hello')

Here's the code for the child:

    with open(share_name, 'r+b') as f:
        share = mmap(f.fileno(), 0)
        share[:5] = b'gone\n'

Here's the code for the parent:

    child = Process(target=proc)
    child.start()
    with open(share_name, mode='r+b') as f:
        share = mmap(f.fileno(), 0)
        while share[0] == ord('h'):
            sleep(1)
        print('main:', share.readline())

...

Yes, but we're not talking about multiprocessing.Queue. We're talking about mmap. multiprocessing.Queue doesn't use mmap. For that, you want to use multiprocessing.Value and multiprocessing.Array.

Mike Meyer

4:14 p.m.

On Sun, 12 Feb 2012 15:15:46 +0100 Sturla Molden <sturla@molden.no> wrote:

...

Sturla Molden

4:20 p.m.

Den 12.02.2012 23:14, skrev Mike Meyer:

...

My apologies, I was confusing you with Matt Joiner.

Sturla

shibturn

7:52 a.m.

On 12/02/2012 3:46am, Sturla Molden wrote:

...

Can you do this? No?

Then try the same thing with a lock (multiprocessing.Lock) or an event.

Sturla Molden

8:35 a.m.

Den 12.02.2012 14:52, skrev shibturn:

...

Mark did not use shm_open, he memory mapped from disk.

...

shibturn

9:20 a.m.

On 12/02/2012 2:35pm, Sturla Molden wrote:

...

Mark did not use shm_open, he memory mapped from disk.

Sturla Molden

February 2012

8:33 p.m.

Den 12.02.2012 16:20, skrev shibturn:

...

I would look at kernel refcounts before unlinking. (But I am not that familiar with Linux.)

Sturla

shibturn

12:31 a.m.

On 12/02/2012 8:33pm, Sturla Molden wrote:

...

It seems that on Linux /tmp is backed by shared memory.

Which sounds rather strange to a Windows user, as the raison d'etre for tempfiles is temporary storage space that goes beyond physical RAM.

In reality /tmp is backed by swap space, so physical RAM does not impose a limit. Anonymous mmaps are also backed by swap space.

...

I've also read that the use of ftruncate in this context can result in SIGBUS.

...

...
Below is a Blob class which seems to work. Note that the process which created the blob needs to wait for the other process to unpickle it before allowing it to be garbage collected.

I would look at kernel refcounts before unlinking. (But I am not that familiar with Linux.)

shibturn

12:42 a.m.

On 13/02/2012 12:31am, shibturn wrote:

...

Ah, on some Unixes ftruncate() limits the size of the file, but will not increase it.

sbt

Mike Meyer

4:15 p.m.

On Sun, Feb 12, 2012 at 3:33 PM, Sturla Molden <sturla@molden.no> wrote:

...

16 comments

5 participants

participants (5)

Antoine Pitrou
Jesse Noller
Mike Meyer
shibturn
Sturla Molden
