[Python-Dev] Re: Python-Dev digest, Vol 1 #3221 - 4 msgs
Glyph Lefkowitz
glyph@twistedmatrix.com
Mon, 28 Apr 2003 15:49:27 -0500
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Monday, April 28, 2003, at 11:00 AM, python-dev-request@python.org
wrote:
> Itamar> If this slowdown is confirmed, it is really not acceptable,
> Itamar> since the change seems to have been made only to support
> Itamar> making timeout sockets slightly easier to use.
>
> It was done to support making timeout sockets work properly. As they
> existed previously, timeout sockets wouldn't work with protocols which
> would most likely use them: higher level modules such as httplib, which
> call sock.makefile(), then call readlines?() on the resulting file object.
Clearly this is a flaw in httplib's design. Perhaps one should be able
to pass in a socket or file factory? That would allow speaking HTTP
over non-TCP transports or through something like a SOCKS proxy, which
is arguably a good thing. Do you want to add SOCKS support by adding
another wrapper around the socket module as well? How about a Python
software firewall? Pretty soon our "correct" socket module will have
20 performance-destroying wrappers around it in order to work around
deficiencies in the interfaces of some programs which use sockets.
httplib is importing a module where passing a factory function is the
correct thing to do. At first it looks like you can parameterize it by
hacking up a module, but you can only do that once or twice before the
design problem really becomes pressing.
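To make the factory idea concrete, here is a minimal sketch. TinyHTTPClient,
its socket_factory parameter, and FakeTransport are hypothetical names made up
for illustration; none of this is httplib's actual interface. The only
assumption is that the factory returns something with sendall(), makefile()
and close().

```python
import io
import socket


class TinyHTTPClient:
    """Hypothetical client, for illustration only (not httplib's API)."""

    def __init__(self, host, port=80, socket_factory=None):
        self.host = host
        self.port = port
        # Default to plain TCP; callers may pass any callable that returns
        # an object with sendall(), makefile() and close().
        self._connect = socket_factory or self._tcp_connect

    @staticmethod
    def _tcp_connect(host, port):
        return socket.create_connection((host, port))

    def get_status_line(self, path="/"):
        sock = self._connect(self.host, self.port)
        try:
            request = "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, self.host)
            sock.sendall(request.encode("ascii"))
            reader = sock.makefile("rb")
            try:
                return reader.readline().decode("ascii").rstrip("\r\n")
            finally:
                reader.close()
        finally:
            sock.close()


class FakeTransport:
    """In-memory stand-in for a socket, i.e. a non-TCP transport."""

    def __init__(self, canned_response):
        self.sent = b""
        self._response = canned_response

    def sendall(self, data):
        self.sent += data

    def makefile(self, mode="rb"):
        return io.BytesIO(self._response)

    def close(self):
        pass
```

A SOCKS proxy or a timeout-aware transport would plug in the same way, by
passing a different factory, and nobody else's sockets would have to pay
for it.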
The socket module is not a high-level interface to networking.
Attempting to make it into one will harm its utility as a low-level
interface that good high-level interfaces can be built on top of.
> Itamar> Why should everyone have to pay a speed penalty just so a
> Itamar> minority of people can skip calling a
> Itamar> "socket.installtimeoutsupport()" at the beginning of their
> Itamar> program? It's just one line of code they'd need to add.
>
> I think it would be easier for the minority of programs that care
> about the 20% performance loss to simply set
I think this should be in the release notes for 2.3. "Python is 10%
faster, unless you use sockets, in which case it is much, much slower.
Do the following in order to regain lost performance and retain the
same semantics:"
I anticipate that more than just Twisted will want to monkey-patch the
module. (A 20% drop in throughput is a significant issue to more than
an eclectic audience.) If you're not going to fix this bug, maybe we
could have a "socket.monkeypatch()" method which would prevent
different systems from stepping on each other when they do it?
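Roughly what I have in mind, as a sketch only: monkeypatch(), unpatch() and the
owner argument are hypothetical and do not exist in the socket module. The
point is just that a second patcher gets an error instead of silently
clobbering the first.

```python
import types

# (module name, attribute) -> (owner, original object)
_patches = {}


def monkeypatch(module, attr, replacement, owner):
    """Replace module.attr once; refuse a second, conflicting patch."""
    key = (module.__name__, attr)
    if key in _patches:
        current_owner = _patches[key][0]
        raise RuntimeError(
            "%s.%s is already patched by %s" % (key[0], key[1], current_owner)
        )
    _patches[key] = (owner, getattr(module, attr))
    setattr(module, attr, replacement)


def unpatch(module, attr, owner):
    """Restore the original attribute; only the patch's owner may do so."""
    key = (module.__name__, attr)
    patch_owner, original = _patches[key]
    if patch_owner != owner:
        raise RuntimeError("%s.%s belongs to %s" % (key[0], key[1], patch_owner))
    setattr(module, attr, original)
    del _patches[key]


# Usage against a stand-in module (deliberately not the real socket module).
fake = types.ModuleType("fakesocket")
fake.socket = "original"
monkeypatch(fake, "socket", "fast", owner="twisted")
```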
> I don't know about you, but fast and incorrect don't help me much.
Since when is the behavior of the socket module "incorrect"? If
anything the interface to "timeout sockets" is incorrect, because BSD
sockets do not in fact support timeouts. The interface is doing a
bunch of things behind the user's back which would be better done
another way, for example, with actually asynchronous networking. It's
pretty likely that there is some obscure corner-case that the select()
in timeout sockets doesn't catch.
From a brief glance, internal_select ignores error return values, and
nothing checks its errno before making another socket call. If I
remember correctly, that means that if select gets an EINTR, the
following call to accept() or recv() or what-have-you may very well
block. Of course, since the socket is in non-blocking mode at this
point, that means that Python will raise an exception on the
EAGAIN/EWOULDBLOCK error. This is pretty hard to write a test for.
I could be wrong about this particular error, but in general if one
wishes to be pedantic about "correctness", one must first check the
result codes from one's C system calls.
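For illustration, here is the kind of checking I mean, sketched in Python
rather than C. recv_with_timeout is a hypothetical helper, not the code in
socketmodule.c, and it assumes the socket has already been put in non-blocking
mode. Newer Python versions (3.5 and later, per PEP 475) retry select() after
EINTR on their own, so the InterruptedError branch is there mainly to make the
logic explicit.

```python
import select
import socket
import time


def recv_with_timeout(sock, nbytes, timeout):
    """Wait up to `timeout` seconds for data, then recv.

    Raises socket.timeout if nothing arrives in time.
    """
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise socket.timeout("timed out")
        try:
            readable, _, _ = select.select([sock], [], [], remaining)
        except InterruptedError:
            # EINTR: a signal arrived. Do not fall through to recv();
            # recompute the remaining time and wait again.
            continue
        if not readable:
            raise socket.timeout("timed out")
        try:
            return sock.recv(nbytes)
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK: the readiness report was stale or
            # spurious, so go back to waiting instead of raising.
            continue
```

The deadline is recomputed on every pass, so an interrupted or spurious
wakeup never extends the total wait beyond the requested timeout.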
> Feel free to submit a patch which improves performance but maintains
> proper behavior in the face of timeouts (that is, allows
> test_urllibnet to still work correctly).
Why is the Python development team introducing bugs into Python and
then expecting the user community to fix things that used to work? I
could understand not wanting to put a lot of effort into correcting
obscure or difficult-to-find performance problems that only a few
people care about, but the obvious thing to do in this case is simply
to change the default behavior.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (Darwin)
iD8DBQE+rZPbvVGR4uSOE2wRAhZVAKCjWkl1NSr8bC1DGcbvhKwL4GZ9+ACeO2cJ
FNU17XosCZxRTVRF/wIkLys=
=GJ3H
-----END PGP SIGNATURE-----