Python 2.3b1 has 20% slower networking?

The "we always wrap socket objects with python class" change seems to have slowed down networking on Linux (and presumably other platforms where socket objects used to be unwrapped). Moshe Zadka ran some benchmarks on Linux (2.4.9 - a redhat machine at work probably) with 2.2 and 2.3b1 using Demos/sockets/throughput.py.

For count of 1000:

    2.3 server, 2.3 client: Throughput: 13556.811 K/sec.
    2.3 server, 2.2 client: Throughput: 24917.862 K/sec.
    2.2 server, 2.2 client: Throughput: 29838.491 K/sec.

10,000:

    2.3 server, 2.3 client: Throughput: 35994.749 K/sec.
    2.3 server, 2.2 client: Throughput: 34398.085 K/sec.
    2.2 server, 2.2 client: Throughput: 49488.916 K/sec.

50,000:

    2.3 server, 2.3 client: Throughput: 39002.538 K/sec.
    2.3 server, 2.2 client: Throughput: 48064.785 K/sec.
    2.2 server, 2.2 client: Throughput: 59799.672 K/sec.

On a 2.3a2 I have I did "socket.socket = socket._socketobject", and got a 20% slowdown compared to 2.2 on throughput. (2.3a2 without this change is the same speed as 2.2). Can other people do some tests to verify these numbers?

If this slowdown is confirmed, it is really not acceptable, since the change seems to have been made only to support making timeout sockets slightly easier to use. Why should everyone have to pay a speed penalty just so a minority of people can skip calling a "socket.installtimeoutsupport()" at the beginning of their program? It's just one line of code they'd need to add.

In real programs the speed drop would probably be much less pronounced, although I bet this slows down e.g. Anthony Baxter's portforwarder quite a bit. If Python 2.3 is released without fixing this Twisted will probably monkeypatch the socket module so that we can get full performance, since we have our own (unavoidable) layers of Python indirection :)

-- 
Itamar Shtull-Trauring http://itamarst.org/
http://www.zoteca.com -- Python & Twisted consulting

On 27 Apr 2003 22:19:44 +0200 martin@v.loewis.de (Martin v. Löwis) wrote:
Can other people do some tests to verify these numbers?
For that, it would be good if Moshe's test procedure was published.
On Debian, you can do:

    cd /usr/share/doc/python2.2/examples/Demos/sockets.py
    python2.2 throughput.py -s &
    python2.2 throughput.py -c 10000 localhost

and try with python2.3 and different numbers other than 10000. On non-Debian platforms/packages it's wherever you have the python examples installed.

-- 
Itamar Shtull-Trauring http://itamarst.org/
http://www.zoteca.com -- Python & Twisted consulting

[Itamar Shtull-Trauring]
So running Demo/sockets/throughput.py with the -c 10000 argument I get under OS X:

* Python 2.2.2: 7976.756k K/sec
* CVS Python (compiled on April 18): 2772.97 K/sec

Now I put no great effort into sterilizing my system so that nothing else was running, so take these numbers with a grain of salt.

-Brett

I can also reproduce the slowdown. Measured on a Redhat 9 machine, python-2.2.2-26.i386.rpm vs python 2.3b1 compiled with default options. 700MHz Pentium III in a laptop. best of 3 runs. Count of 100000. Running over the loopback device. Sentence fragments.

    Server  Client  Throughput  Speed
    2.2     2.2     53520.4     100.00%
    2.2     2.3b1   43726.28     81.70%
    2.3b1   2.2     43032.06     80.40%
    2.3b1   2.3b1   38283.78     71.53%

System load was low at the time, though I had various apps running.

I also ran the test over my 802.11b wireless setup:

    Server  Client  Throughput  Speed
    2.2     2.2     639.16      100.00%
    2.3b1   2.2     639.07       99.98%

(client was a 350MHz machine with various programs running)

that is, when running over a relatively slow link (theoretically, 11mbps) the slowdown is not measurable. However, I don't think that this really decreases the importance of this performance regression.

Jeff
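For readers who want to check the "Speed" column, here is a small sketch that recomputes the loopback percentages from the throughput figures in the message above. The function name `relative_speed` and the dictionary layout are made up for this illustration, and the figures are assumed to be in K/sec as printed by throughput.py.

```python
# Loopback throughput figures copied from the table above
# (assumed to be K/sec, as printed by throughput.py).
BASELINE = 53520.4  # 2.2 server, 2.2 client

# (server version, client version) -> throughput
results = {
    ("2.2", "2.3b1"): 43726.28,
    ("2.3b1", "2.2"): 43032.06,
    ("2.3b1", "2.3b1"): 38283.78,
}


def relative_speed(throughput, baseline=BASELINE):
    """Return throughput as a percentage of the baseline run."""
    return 100.0 * throughput / baseline


for (server, client), kps in results.items():
    pct = relative_speed(kps)
    print(f"{server:>5} server, {client:>5} client: "
          f"{pct:6.2f}% of baseline ({100.0 - pct:5.2f}% slower)")
```

By this measure the wrapper costs roughly 18 to 20 percent when only one side runs 2.3b1, and a little under 30 percent when both sides do.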

I'm guessing that the slowdown comes from the fact that calling a method like recv() on the wrapper object is now a Python method which calls the C method on the wrapped object. I wonder if the slowdown can't be easily repaired by changing the wrapper class to copy the relevant methods to instance variables.

It would be even nicer to use subclassing instead of a wrapper object. I vaguely recall that I tried this before but couldn't figure out how to do it, but I've got a feeling that it ought to be doable -- after all the C socket object has separate __new__ and __init__ methods.

I hope someone can take this ball and submit a patch -- it would indeed be a shame to have to live with the slowdown (even if it only shows up when using the loopback device) or to have a practice of monkey patching socket.py.

(BTW instead of monkey-patching socket.py, it might be easier to write "import _socket as socket".)

--Guido van Rossum (home page: http://www.python.org/~guido/)
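A rough sketch of the "copy the relevant methods to instance variables" idea may help here. This is not the code from socket.py or from any patch; `FakeCSocket`, `DelegatingWrapper` and `CopyingWrapper` are hypothetical names invented for the illustration, and `FakeCSocket` merely stands in for the C-level socket object so the example runs without a network.

```python
class FakeCSocket:
    """Hypothetical stand-in for the C-level socket object."""

    def __init__(self, data):
        self._data = data

    def recv(self, bufsize):
        # Hand back up to bufsize bytes and keep the remainder.
        chunk, self._data = self._data[:bufsize], self._data[bufsize:]
        return chunk


class DelegatingWrapper:
    """Wrapper where each recv() passes through an extra Python-level call."""

    def __init__(self, sock):
        self._sock = sock

    def recv(self, bufsize):
        return self._sock.recv(bufsize)


class CopyingWrapper:
    """Wrapper that copies the inner object's bound methods onto itself.

    After __init__, wrapper.recv *is* the inner object's bound method,
    so calling it skips the extra Python-level frame.
    """

    _delegated = ("recv",)  # a real wrapper would list send(), etc. as well

    def __init__(self, sock):
        self._sock = sock
        for name in self._delegated:
            setattr(self, name, getattr(sock, name))


slow = DelegatingWrapper(FakeCSocket(b"hello world"))
fast = CopyingWrapper(FakeCSocket(b"hello world"))
print(slow.recv(5), fast.recv(5))
```

The trade-off in this sketch is that a copied method cannot be overridden by defining a method of the same name on the wrapper class, because the instance attribute is found first during lookup.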

    Itamar> If this slowdown is confirmed, it is really not acceptable,
    Itamar> since the change seems to have been made only to support making
    Itamar> timeout sockets slightly easier to use.

It was done to support making timeout sockets work properly. As they existed previously, timeout sockets wouldn't work with protocols which would most likely use them: higher level modules such as httplib, which call sock.makefile(), then call readlines?() on the resulting file object.

    Itamar> Why should everyone have to pay a speed penalty just so a
    Itamar> minority of people can skip calling a
    Itamar> "socket.installtimeoutsupport()" at the beginning of their
    Itamar> program? it's just one line of code they'd need to add.

I think it would be easier for the minority of programs that care about the 20% performance loss to simply set

    import socket, _socket
    socket.socket = socket.SocketType = _socket.socket

I don't know about you, but fast and incorrect don't help me much. Feel free to submit a patch which improves performance but maintains proper behavior in the face of timeouts (that is, allows test_urllibnet to still work correctly).

Skip

For whatever reason, it actually doesn't seem to matter. Python2.2 seems to clock in about 10% slower (in throughput and connections/second) than the same code running under 2.3a1. Upgrading to current-CVS, I see almost no difference between 2.3a1 and current-CVS (maybe 5% improvement). (FWIW, python2.1 is almost 25% slower than current-cvs!) The code in question is pythondirector, a pure-python TCP loadbalancer, http://pythondirector.sf.net/. In this case all the above were run with Twisted 1.0.3. All tests were run on my laptop via the loopback interface. Anthony -- Anthony Baxter <anthony@interlink.com.au> It's never too late to have a happy childhood.

On Wed, 30 Apr 2003 17:29:08 +1000 Anthony Baxter <anthony@interlink.com.au> wrote:
For whatever reason, it actually doesn't seem to matter.
OK, great. And thanks to the python-dev team for fixing the issue in CVS so quickly. -- Itamar Shtull-Trauring http://itamarst.org/ http://www.zoteca.com -- Python & Twisted consulting

participants (7): Anthony Baxter, Brett Cannon, Guido van Rossum, Itamar Shtull-Trauring, Jeff Epler, martin@v.loewis.de, Skip Montanaro