Re: Python-Dev digest, Vol 1 #3221 - 4 msgs
On Monday, April 28, 2003, at 11:00 AM, python-dev-request@python.org wrote:
Itamar> If this slowdown is confirmed, it is really not acceptable,
Itamar> since the change seems to have been made only to support making
Itamar> timeout sockets slightly easier to use.
It was done to support making timeout sockets work properly. As they existed previously, timeout sockets wouldn't work with protocols which would most likely use them: higher level modules such as httplib, which call sock.makefile(), then call readlines?() on the resulting file object.
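For reference, the behaviour at issue can be sketched on a current Python 3, where a file object returned by makefile() does honour the socket's timeout. This is only a minimal illustration, not the 2.3 implementation; the helper name readline_times_out is made up for the example, and socketpair() stands in for a real connection.

```python
import socket


def readline_times_out(timeout=0.05):
    """Return True if readline() on a makefile() stream honours the timeout.

    Hypothetical helper for illustration; assumes a current Python 3.
    """
    a, b = socket.socketpair()
    try:
        a.settimeout(timeout)
        stream = a.makefile("rb")
        try:
            # The peer never sends anything, so this should time out
            # rather than block forever.
            stream.readline()
        except socket.timeout:
            return True
        finally:
            stream.close()
        return False
    finally:
        a.close()
        b.close()
```

Note that after a timeout the file object should be treated as unusable, since its internal buffer may be in an inconsistent state.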
Clearly this is a flaw in httplib's design. Perhaps one should be able to pass in a socket or file factory? That would allow speaking HTTP over non-TCP transports or through something like a SOCKS proxy, which is arguably a good thing.
Do you want to add SOCKS support by adding another wrapper around the socket module as well? How about a python software firewall? Pretty soon our "correct" socket module will have 20 performance-destroying wrappers around it in order to work around deficiencies in the interfaces of some programs which use sockets.
httplib is importing a module where passing a factory function is the correct thing to do. At first it looks like you can parameterize it by hacking up a module, but you can only do that once or twice before the design problem really becomes pressing.
The socket module is not a high-level interface to networking. Attempting to make it into one will harm its utility as a low-level interface that good high-level interfaces can be built on top of.
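The factory idea might look roughly like the sketch below. Everything here is an assumption made for illustration: TinyHTTPClient, its socket_factory parameter and get_status_line() are hypothetical names, not httplib's interface. The point is only that the caller supplies the transport, so a timeout, a SOCKS proxy or a non-TCP transport can be chosen without wrapping the socket module.

```python
import socket


class TinyHTTPClient:
    """Hypothetical client that takes a socket factory (not httplib's API)."""

    def __init__(self, host, port=80, socket_factory=None):
        self.host = host
        self.port = port
        # The factory is any callable (host, port) -> connected socket-like
        # object; the default just opens a plain TCP connection.
        self._socket_factory = socket_factory or self._default_factory

    @staticmethod
    def _default_factory(host, port):
        return socket.create_connection((host, port))

    def get_status_line(self, path="/"):
        """Send a bare HTTP/1.0 GET and return the first line of the reply."""
        sock = self._socket_factory(self.host, self.port)
        try:
            request = "GET {} HTTP/1.0\r\nHost: {}\r\n\r\n".format(path, self.host)
            sock.sendall(request.encode("ascii"))
            with sock.makefile("rb") as reply:
                return reply.readline().decode("ascii").rstrip("\r\n")
        finally:
            sock.close()
```

A caller who wants timeouts could pass a factory that calls settimeout() on the socket before returning it, leaving the default path untouched for everyone else.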
Itamar> Why should everyone have to pay a speed penalty just so a
Itamar> minority of people can skip calling a
Itamar> "socket.installtimeoutsupport()" at the beginning of their
Itamar> program? It's just one line of code they'd need to add.
I think it would be easier for the minority of programs that care about the 20% performance loss to simply set
I think this should be in the release notes for 2.3. "Python is 10% faster, unless you use sockets, in which case it is much, much slower. Do the following in order to regain lost performance and retain the same semantics:" I anticipate that more than just Twisted will want to monkey-patch the module. (A 20% drop in throughput is a significant issue to more than an eclectic audience.) If you're not going to fix this bug, maybe we could have a "socket.monkeypatch()" method which would prevent different systems from stepping on each other when they do it?
I don't know about you, but fast and incorrect don't help me much.
Since when is the behavior of the socket module "incorrect"? If anything the interface to "timeout sockets" is incorrect, because BSD sockets do not in fact support timeouts. The interface is doing a bunch of things behind the user's back which would be better done another way, for example, with actually asynchronous networking.
It's pretty likely that there is some obscure corner-case that the select() in timeout sockets doesn't catch. From a brief glance, internal_select ignores error return values, and nothing checks its errno before making another socket call. If I remember correctly, that means that if select gets an EINTR, the following call to accept() or recv() or what-have-you may very well block. Of course, since the socket is in non-blocking mode at this point, that means that Python will raise an exception on the EAGAIN/EWOULDBLOCK error. This is pretty hard to write a test for.
I could be wrong about this particular error, but in general if one wishes to be pedantic about "correctness", one must first check the result codes from one's C system calls.
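The kind of checking being asked for can be sketched in Python rather than C. This is an illustration only, not the socketmodule.c code: wait_readable and recv_with_timeout are hypothetical names, and current Python versions already retry select() after EINTR on their own (PEP 475), so the explicit except branch is there only to make the retry-with-remaining-time logic visible.

```python
import select
import socket
import time


def wait_readable(sock, timeout):
    """Return True if sock becomes readable within timeout seconds.

    Hypothetical helper: the result of select() is always inspected, and an
    interrupted call is retried with only the time that is left.
    """
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            readable, _, _ = select.select([sock], [], [], remaining)
        except InterruptedError:
            # Interrupted by a signal: do not fall through to recv(),
            # go round again with the reduced timeout.
            continue
        return bool(readable)


def recv_with_timeout(sock, nbytes, timeout):
    """Call recv() only after select() has reported the socket readable."""
    if not wait_readable(sock, timeout):
        raise socket.timeout("timed out")
    return sock.recv(nbytes)
```

The design choice is simply that no socket call is made on the strength of a select() whose outcome was never examined.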
Feel free to submit a patch which improves performance but maintains proper behavior in the face of timeouts (that is, allows test_urllibnet to still work correctly).
Why is the Python development team introducing bugs into Python and then expecting the user community to fix things that used to work? I could understand not wanting to put a lot of effort into correcting obscure or difficult-to-find performance problems that only a few people care about, but the obvious thing to do in this case is simply to change the default behavior.
I think this should be in the release notes for 2.3. "Python is 10% faster, unless you use sockets, in which case it is much, much slower. Do the following in order to regain lost performance and retain the same semantics:"
That is total bullshit, Glyph, and you know it.
--Guido van Rossum (home page: http://www.python.org/~guido/)
Why is the Python development team introducing bugs into Python and then expecting the user community to fix things that used to work?
I resent your rhetoric, Glyph. Had you read the rest of this thread, you would have seen that the performance regression only happens for sending data at maximum speed over the loopback device, and is negligible when receiving e.g. data over a LAN. You would also have seen that I have already suggested two different simple fixes.
I could understand not wanting to put a lot of effort into correcting obscure or difficult-to-find performance problems that only a few people care about, but the obvious thing to do in this case is simply to change the default behavior.
It can and will be fixed. I just don't have the time to fix it myself. The functionality (of having timeouts work properly for streams created by socket.makefile()) is useful to have.
--Guido van Rossum (home page: http://www.python.org/~guido/)
On Monday, April 28, 2003, at 04:02 PM, Guido van Rossum wrote:
Why is the Python development team introducing bugs into Python and then expecting the user community to fix things that used to work?
I resent your rhetoric, Glyph. Had you read the rest of this thread, you would have seen that the performance regression only happens for sending data at maximum speed over the loopback device, and is negligible when receiving e.g. data over a LAN. You would also have seen that I have already suggested two different simple fixes.
I apologize. I did not seriously mean this as an indictment of the entire Python development team or process. I would have responded to this effect sooner, but I've been swamped with work.
I could understand not wanting to put a lot of effort into correcting obscure or difficult-to-find performance problems that only a few people care about, but the obvious thing to do in this case is simply to change the default behavior.
It can and will be fixed. I just don't have the time to fix it myself.
I noticed your comment about the checkin. Thanks to the dev team for fixing it so promptly.
I think this should be in the release notes for 2.3. "Python is 10% faster, unless you use sockets, in which case it is much, much slower. Do the following in order to regain lost performance and retain the same semantics:"
That is total bullshit, Glyph, and you know it.
Please pardon the exaggeration. I forget that sarcasm does not come across as well on e-mail as it does on IRC. I appreciate that the performance drop wasn't really that serious.
On a more positive note, looking at performance numbers got us thinking about increasing performance in Twisted. Anthony Baxter has been very helpful with profiling information, Itamar's already written some benchmarking tests, and I finished up a logging infrastructure that is more amenable to metrics gathering last night. (It's also less completely awful than the one we had before and should hook up to the new logging.py gracefully.)
We already have an always-on multi-platform regression test suite for Twisted (not the snake farm): http://www.twistedmatrix.com/users/warner.twistd/
If we get this reporting some performance numbers as well, it would be pretty easy to turn it into a regression/performance test for Python by tweaking a few variables -- probably, just 'cvs update; make' in the Python directory instead of the Twisted one. Is there interest in seeing these kinds of numbers generated regularly? What kind of numbers would be interesting on the Python side?
Apologies accepted, Glyph.
--Guido van Rossum (home page: http://www.python.org/~guido/)
participants (2)
- Glyph Lefkowitz
- Guido van Rossum