[Python-Dev] [Python-checkins] cpython: Switch subprocess stdin to a socketpair, attempting to fix issue #19293 (AIX

Charles-François Natali cf.natali at gmail.com
Wed Oct 23 20:25:47 CEST 2013


> For the record, pipe I/O seems a little faster than socket I/O under
> Linux:
>
> $ ./python -m timeit -s "import os, socket; a,b = socket.socketpair(); r=a.fileno(); w=b.fileno(); x=b'x'*1000" "os.write(w, x); os.read(r, 1000)"
> 1000000 loops, best of 3: 1.1 usec per loop
>
> $ ./python -m timeit -s "import os, socket; a,b = socket.socketpair(); x=b'x'*1000" "a.sendall(x); b.recv(1000)"
> 1000000 loops, best of 3: 1.02 usec per loop
>
> $ ./python -m timeit -s "import os; r, w = os.pipe(); x=b'x'*1000" "os.write(w, x); os.read(r, 1000)"
> 1000000 loops, best of 3: 0.82 usec per loop

That's a raw write()/read() benchmark, but it's not taking something
important into account: pipes/sockets are usually used to communicate
between concurrently running processes. And in this case, an important
factor is the pipe/socket buffer size: the smaller it is, the more
context switches (due to blocking writes/reads) you'll get, which
greatly decreases throughput.
And by default, Unix sockets have larger buffers than pipes (between 4K
and 64K for pipes depending on the OS).
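
A quick way to compare the default buffer sizes on a given box is to
fill each channel with non-blocking writes until the kernel refuses
more. This is only a minimal sketch: the helper names are made up for
illustration, and the numbers it prints depend on the OS and its
settings.

```python
import os
import socket


def writable_before_blocking(write_fd, chunk_size=1024):
    """Count the bytes a non-blocking FD accepts while nobody reads the other end."""
    os.set_blocking(write_fd, False)
    chunk = b"x" * chunk_size
    total = 0
    while True:
        try:
            # os.write() may accept fewer bytes than requested; count what it took.
            total += os.write(write_fd, chunk)
        except BlockingIOError:
            # The kernel buffer is full: this is where a blocking writer would sleep.
            return total


def pipe_capacity():
    r, w = os.pipe()
    try:
        return writable_before_blocking(w)
    finally:
        os.close(r)
        os.close(w)


def socketpair_capacity():
    a, b = socket.socketpair()
    try:
        return writable_before_blocking(a.fileno())
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print("pipe:      ", pipe_capacity())
    print("socketpair:", socketpair_capacity())
```

The measured value is the amount a writer can push before it has to
wait for the reader, which is what drives the context-switch count.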

I wrote a quick benchmark forking a child process, with the parent
writing data through the pipe, and waiting for the child to read it
all. Here are the results (on Linux):

# time python /tmp/test.py pipe

real    0m2.479s
user    0m1.344s
sys     0m1.860s

# time python /tmp/test.py socketpair

real    0m1.454s
user    0m1.242s
sys     0m1.234s

So socketpair is actually faster.
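
/tmp/test.py itself isn't reproduced here. A rough sketch of a
benchmark along those lines could look like the following. It is an
assumption about the general shape, not the actual script: the chunk
size, the total amount of data and the function names are all made up.

```python
import os
import socket
import sys
import time

CHUNK = b"x" * 65536   # size of each write (assumed value)
TOTAL_CHUNKS = 2000    # about 125 MiB overall (assumed value)


def make_channel(kind):
    """Return (read_fd, write_fd) for a pipe or a Unix socketpair."""
    if kind == "pipe":
        return os.pipe()
    a, b = socket.socketpair()
    # detach() hands over the raw FDs so the socket objects won't close them.
    return a.detach(), b.detach()


def write_all(fd, data):
    """Blocking write loop that copes with short writes."""
    view = memoryview(data)
    while view:
        n = os.write(fd, view)
        view = view[n:]


def run(kind, chunks=TOTAL_CHUNKS):
    """Fork a reader child, write all the data from the parent, return elapsed seconds."""
    r, w = make_channel(kind)
    pid = os.fork()
    if pid == 0:
        # Child: drop the write end, then drain the read end until EOF.
        os.close(w)
        while os.read(r, len(CHUNK)):
            pass
        os._exit(0)
    # Parent: drop the read end, write everything, then wait for the child.
    os.close(r)
    start = time.perf_counter()
    for _ in range(chunks):
        write_all(w, CHUNK)
    os.close(w)          # signals EOF to the child
    os.waitpid(pid, 0)
    return time.perf_counter() - start


if __name__ == "__main__":
    kind = sys.argv[1] if len(sys.argv) > 1 else "pipe"
    print(f"{kind}: {run(kind):.3f}s")
```

Run it once with "pipe" and once with "socketpair" as the argument to
compare the two; absolute timings will vary with the machine and
kernel.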

But as noted by Victor, there are slight differences between pipes and
sockets. The ones I can think of:
- pipes guarantee write atomicity if at most PIPE_BUF bytes are written,
which is not the case for sockets
- more annoying: in subprocess, the pipes are not set non-blocking:
after select()/poll() reports an FD as write-ready, we write at most
PIPE_BUF at a time to avoid blocking: this likely wouldn't work with a
socketpair
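
To make the second point concrete, the pattern looks roughly like
this. It is a simplified sketch, not the actual subprocess code, and
the function name is made up for illustration:

```python
import os
import select


def write_without_blocking(fd, data):
    """Write data to a blocking pipe FD in chunks of at most PIPE_BUF bytes."""
    view = memoryview(data)
    offset = 0
    while offset < len(view):
        # Wait until the kernel reports the FD as write-ready.
        select.select([], [fd], [])
        # Write at most PIPE_BUF bytes: the pattern relies on such a
        # write not blocking on a pipe once it is reported write-ready.
        # A socketpair gives no such PIPE_BUF-based guarantee.
        offset += os.write(fd, view[offset:offset + select.PIPE_BUF])
    return offset
```

With a socket in blocking mode, the same loop could stall inside
os.write() even though select() just reported the FD as writable,
which is why non-blocking FDs are the safer choice there.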

But this patch doesn't touch subprocess itself, and the FDs are only
used by asyncio, which sets them non-blocking: so this could only be
an issue for the spawned process, if it does rely on the two
pipe-specific behaviors above.

OTOH, having a unique implementation on all platforms makes sense, and
I don't know if it'll actually be a problem in practice, so we could
ship as-is and wait until someone complains ;-)

cf

