Re: [Python-Dev] [Python-checkins] cpython: Switch subprocess stdin to a socketpair, attempting to fix issue #19293 (AIX

Hi,

Would it be possible to use os.pipe() on all OSes except AIX?

Pipes and socket pairs may have minor differences, but some applications may rely on these minor differences. For example, is the buffer size the same? For example, in test.support, we have two constants: PIPE_MAX_SIZE (4 MB) and SOCK_MAX_SIZE (16 MB).

Victor

2013/10/22 guido.van.rossum <python-checkins@python.org>:
http://hg.python.org/cpython/rev/2a0bda8d283d
changeset: 86557:2a0bda8d283d
user: Guido van Rossum <guido@dropbox.com>
date: Mon Oct 21 20:37:14 2013 -0700
summary: Switch subprocess stdin to a socketpair, attempting to fix issue #19293 (AIX hang).

files:
  Lib/asyncio/unix_events.py | 29 +++++++++-
  Lib/test/test_asyncio/test_unix_events.py | 7 ++
  2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/Lib/asyncio/unix_events.py b/Lib/asyncio/unix_events.py
--- a/Lib/asyncio/unix_events.py
+++ b/Lib/asyncio/unix_events.py
 if stdin == subprocess.PIPE:
     self._pipes[STDIN] = None
+    # Use a socket pair for stdin, since not all platforms
+    # support selecting read events on the write end of a
+    # socket (which we use in order to detect closing of the
+    # other end). Notably this is needed on AIX, and works
+    # just fine on other platforms.
+    stdin, stdin_w = self._loop._socketpair()
 if stdout == subprocess.PIPE:
     self._pipes[STDOUT] = None
 if stderr == subprocess.PIPE:
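The comment in the diff relies on one behaviour of sockets: when the other end is closed, the local end becomes readable and a read returns an empty result. A minimal sketch of that idea follows, using only the standard selectors and socket modules. The helper name peer_closed is hypothetical and is not part of asyncio; this is not the code from the changeset.

```python
import selectors
import socket


def peer_closed(sock, timeout=1.0):
    """Return True if the other end of `sock` appears to be closed.

    Hypothetical helper: waits for a read event on `sock`; a closed
    peer makes the socket readable and a read then yields b"".
    """
    with selectors.DefaultSelector() as sel:
        sel.register(sock, selectors.EVENT_READ)
        if not sel.select(timeout):
            return False  # no read event: peer still open, nothing sent
    # Peek so that real data, if any, stays in the receive buffer.
    return sock.recv(1, socket.MSG_PEEK) == b""


parent_end, child_end = socket.socketpair()
parent_end.setblocking(False)

before = peer_closed(parent_end, timeout=0.1)
child_end.close()  # stands in for the child process closing its stdin
after = peer_closed(parent_end)
print(before, after)
```

With a socket pair the same check works on both ends, which is presumably why the patch can use one code path on every platform.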

On Tue, 22 Oct 2013 10:54:03 +0200, Victor Stinner <victor.stinner@gmail.com> wrote:
Hi,
Would it be possible to use os.pipe() on all OSes except AIX?
Pipes and socket pairs may have minor differences, but some applications may rely on these minor differences. For example, is the buffer size the same? For example, in test.support, we have two constants: PIPE_MAX_SIZE (4 MB) and SOCK_MAX_SIZE (16 MB).
For the record, pipe I/O seems a little faster than socket I/O under Linux:

$ ./python -m timeit -s "import os, socket; a,b = socket.socketpair(); r=a.fileno(); w=b.fileno(); x=b'x'*1000" "os.write(w, x); os.read(r, 1000)"
1000000 loops, best of 3: 1.1 usec per loop

$ ./python -m timeit -s "import os, socket; a,b = socket.socketpair(); x=b'x'*1000" "a.sendall(x); b.recv(1000)"
1000000 loops, best of 3: 1.02 usec per loop

$ ./python -m timeit -s "import os; r, w = os.pipe(); x=b'x'*1000" "os.write(w, x); os.read(r, 1000)"
1000000 loops, best of 3: 0.82 usec per loop

Regards

Antoine.
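The buffer-size question quoted above can be checked empirically. The sketch below (Unix only) fills a pipe and a socket pair in non-blocking mode until a write would block, then reports how many bytes each accepted. The name fill_capacity is made up for this illustration, and the numbers depend on the OS and its settings, so no expected values are given.

```python
import os
import socket


def fill_capacity(write_fd, chunk=b"x" * 1024):
    """Write to a non-blocking fd until it would block.

    Hypothetical helper: returns the number of bytes the kernel
    accepted while nobody was reading from the other end.
    """
    os.set_blocking(write_fd, False)
    total = 0
    while True:
        try:
            total += os.write(write_fd, chunk)
        except BlockingIOError:
            return total


# Pipe: keep the read end open but never read from it.
r, w = os.pipe()
pipe_capacity = fill_capacity(w)
os.close(r)
os.close(w)

# Socket pair: write into one end, leave the other end unread.
a, b = socket.socketpair()
sock_capacity = fill_capacity(a.fileno())
a.close()
b.close()

print("pipe:", pipe_capacity, "bytes; socketpair:", sock_capacity, "bytes")
```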

"For the record, pipe I/O seems a little faster than socket I/O under Linux" In and old (2006) email on LKML (Linux kernel), I read: "as far as I know pipe() is now much faster than socketpair(), because pipe() uses the zero-copy mechanism." https://lkml.org/lkml/2006/9/24/121 On Linux, splice() can also be used with pipes for zero-copy operations. I don't know if splice() works with socketpair(). Well, I don't think that Python uses splice() now, but it may be interesting to use it. Or sendfile() uses it maybe internally? Victor

On Wed, 23 Oct 2013 13:53:40 +0200, Victor Stinner <victor.stinner@gmail.com> wrote:
"For the record, pipe I/O seems a little faster than socket I/O under Linux"
In an old (2006) email on LKML (Linux kernel), I read: "as far as I know pipe() is now much faster than socketpair(), because pipe() uses the zero-copy mechanism." https://lkml.org/lkml/2006/9/24/121
On Linux, splice() can also be used with pipes for zero-copy operations. I don't know if splice() works with socketpair().
splice() only works with pipes. socketpair() returns sockets, which are not pipes :-)
Well, I don't think that Python uses splice() now, but it may be interesting to use it.
Where do you want to use it?

Regards

Antoine.

For the record, pipe I/O seems a little faster than socket I/O under Linux:
$ ./python -m timeit -s "import os, socket; a,b = socket.socketpair(); r=a.fileno(); w=b.fileno(); x=b'x'*1000" "os.write(w, x); os.read(r, 1000)"
1000000 loops, best of 3: 1.1 usec per loop

$ ./python -m timeit -s "import os, socket; a,b = socket.socketpair(); x=b'x'*1000" "a.sendall(x); b.recv(1000)"
1000000 loops, best of 3: 1.02 usec per loop

$ ./python -m timeit -s "import os; r, w = os.pipe(); x=b'x'*1000" "os.write(w, x); os.read(r, 1000)"
1000000 loops, best of 3: 0.82 usec per loop
That's a raw write()/read() benchmark, but it's not taking something important into account: pipes/sockets are usually used to communicate between concurrently running processes. And in this case, an important factor is the pipe/socket buffer size: the smaller it is, the more context switches (due to blocking writes/reads) you'll get, which greatly decreases throughput. And by default, Unix sockets have larger buffers than pipes (between 4K and 64K for pipes depending on the OS).

I wrote a quick benchmark forking a child process, with the parent writing data through the pipe, and waiting for the child to read it all. Here are the results (on Linux):

# time python /tmp/test.py pipe
real 0m2.479s
user 0m1.344s
sys 0m1.860s

# time python /tmp/test.py socketpair
real 0m1.454s
user 0m1.242s
sys 0m1.234s

So socketpair is actually faster.

But as noted by Victor, there are slight differences between pipes and sockets I can think of:
- pipes guarantee write atomicity if at most PIPE_BUF bytes are written, which is not the case for sockets
- more annoying: in subprocess, the pipes are not set non-blocking: after a select()/poll() returns a FD write-ready, we write less than PIPE_BUF at a time to avoid blocking: this likely wouldn't work with a socketpair

But this patch doesn't touch subprocess itself, and the FDs are only used by asyncio, which sets them non-blocking: so this could only be an issue for the spawned process, if it does rely on the two pipe-specific behaviors above.

OTOH, having a unique implementation on all platforms makes sense, and I don't know if it'll actually be a problem in practice, so we could ship as-is and wait until someone complains ;-)

cf
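The /tmp/test.py script is not included in the thread. A rough reconstruction of the kind of benchmark described (fork a child, have the parent write through a pipe or a socket pair, wait for the child to read everything) might look like the sketch below. The function names, the 64 MB payload and the 64 KB chunk size are all assumptions made for illustration, so the timings will not match the figures above.

```python
import os
import socket
import time


def make_channel(kind):
    """Return (read_fd, write_fd) for a pipe or a socket pair."""
    if kind == "pipe":
        return os.pipe()
    a, b = socket.socketpair()
    # detach() hands over the raw fds so they can be closed with os.close().
    return a.detach(), b.detach()


def transfer(kind, total=64 * 1024 * 1024, chunk=64 * 1024):
    """Fork a reader child, write `total` bytes to it, and wait for it.

    Returns (elapsed_seconds, ok), where ok is True if the child
    reported having read exactly `total` bytes.
    """
    r, w = make_channel(kind)
    payload = b"x" * chunk
    start = time.perf_counter()
    pid = os.fork()
    if pid == 0:
        # Child: read until EOF, report success through the exit code.
        os.close(w)
        received = 0
        while True:
            data = os.read(r, chunk)
            if not data:
                break
            received += len(data)
        os._exit(0 if received == total else 1)
    # Parent: blocking writes, then close to signal EOF to the child.
    os.close(r)
    sent = 0
    while sent < total:
        sent += os.write(w, payload[: total - sent])
    os.close(w)
    _, status = os.waitpid(pid, 0)
    elapsed = time.perf_counter() - start
    ok = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
    return elapsed, ok


if __name__ == "__main__":
    for kind in ("pipe", "socketpair"):
        elapsed, ok = transfer(kind)
        print(f"{kind}: {elapsed:.3f}s ok={ok}")
```

Because both ends use blocking I/O here, a smaller kernel buffer should show up as more frequent blocking in the writer, which is the effect described above; a single run on one machine is of course only indicative.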
participants (3)
- Antoine Pitrou
- Charles-François Natali
- Victor Stinner