Mailman 3 Adding timeout to socket.py and httplib.py - Python-Dev

Adding timeout to socket.py and httplib.py

Facundo Batista

8 Mar 2007

4:44 p.m.

I studied Skip's patch, and I think he is going in a good direction: add a NetworkConnection object to socket.py, and then use it from the other modules. This NetworkConnection basically does what all the other modules do over and over again, so there is no mystery about it (it basically calls getaddrinfo on the received address and tries to open a socket to that address).

I opened a new patch (#1676823) with the changes I propose regarding socket.py, because the goals of the two patches are different (my plan is to go with the basics: first the change in socket.py and httplib, and not in all the other modules at this time).

I do not know what to do with the previous patch (#723312); I guess it'll remain open until all the other modules get the timeout.

Here are the differences between Skip's patch and mine:

- I only left the changes regarding the httplib and socket modules (both .py, docs, and NEWS).
- I even removed a change in Python-ast.c (regarding __version__), but I don't know what that's for, so please enlighten me (thank you).
- The NetworkConnection won't have a ``get_family`` method; if you need the family of the open socket, just ask the socket.
- Added some test cases to test_socket.py regarding attributes, timeout and family, and a nice threaded test to actually try the timeout.
- Added test cases to test_httplib.py.

Feel free to review the patch, and commit it if you want (or tell me to do it after the review, it's just a command for me).

Regards,

-- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/


Facundo Batista

15 Mar

2:42 p.m.

Facundo Batista wrote:

...

I studied Skip patch, and I think he is in good direction: add a NetworkConnection object to socket.py, and then use it from the other modules.

Following the discussion in the patch tracker, this class is now a function in socket.py. This connect() function makes the connection to an address and can optionally receive a timeout.

...

I opened a new patch (#1676823) with the changes I propose regarding socket.py, because the goal from the two patches are different (my plan

The timeout.diff in this patch was updated to reflect these changes. If nobody raises objections, I'll commit these changes.

Regards,

-- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/
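For readers following along, the behaviour described above (resolve the address with getaddrinfo, try to open a socket to it, optionally apply a timeout) could look roughly like the sketch below. This is not the code from patch #1676823; the name `connect_sketch` and the error handling are assumptions made for illustration, and the `timeout=None` default is the variant debated later in the thread.

```python
import socket


def connect_sketch(address, timeout=None):
    """Hypothetical stand-in for the proposed module-level connect()."""
    host, port = address
    last_error = None
    # getaddrinfo may return several candidates (e.g. IPv6 and IPv4);
    # try each one until a connection succeeds.
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, 0, socket.SOCK_STREAM):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            if timeout is not None:
                sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            # Remember the failure and move on to the next candidate.
            last_error = exc
            if sock is not None:
                sock.close()
    if last_error is not None:
        raise last_error
    raise OSError("getaddrinfo returned no addresses")
```

A caller would use it as `sock = connect_sketch(("localhost", 80), timeout=10)`, assuming something is listening on that port.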

Guido van Rossum

3:25 p.m.

I need to shed load; I've asked Georg to review this. If he's fine with it, Facundo can check it in.

-- --Guido van Rossum (home page: http://www.python.org/~guido/)

Georg Brandl

4:59 p.m.

I'll review it tomorrow.

Georg

Facundo Batista

20 Mar

9:22 a.m.

On March 15, Georg Brandl wrote:

...

I'll review it tomorrow.

Do you have any news about this?

Regards,

-- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

Alan Kennedy

10:08 a.m.

[Facundo Batista]

...

Do you have any news about this?

Re: Patch 1676823

http://sourceforge.net/tracker/index.php?func=detail&aid=1676823&group_id=5470&atid=305470

Since I've just written a lot of socket stuff for jython, I thought I'd have a look at the patch.

I like the idea of adding better socket control to the higher-level modules like httplib, etc, because these modules don't provide access to the underlying sockets, and using the socket module methods setdefaulttimeout, etc, is a little messy.

I see that your updated socket.connect() method takes a timeout parameter, which defaults to None if not present, e.g.

    def connect(address, timeout=None):

Later in the function, this line appears

    if timeout is not None:
        sock.settimeout(timeout)

The problem with this is that None has a meaning as a timeout value; it means "put this socket in blocking mode". But that value can no longer be used for socket connects, since that value is being interpreted as "parameter was not provided".

So, if a non-standard timeout has been set, using something like

    import socket ; socket.setdefaulttimeout(10.0)

how do I restore full blocking behaviour to a single socket? (a somewhat contrived case, I admit).

If I have access to the socket object, then I can call "sock_obj.settimeout(None)", but in that case I don't need the new API. I could also do it with the call "sock_obj.setblocking(1)".

If I don't have access to the socket object, i.e. I'm using timeouts indirectly through httplib/etc, then I'm stuck: there's no way I can change the blocking or timeout behaviour; back to square one.

So the new proposed API partly addresses the problem of increasing control over the underlying socket, but doesn't address all cases. It specifically prevents setting a timeout value of None on a socket, which is an important use case, I think.

Regards,

Alan.
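The ambiguity described in this message can be reproduced without any network traffic, because a new socket picks up the module default timeout as soon as it is created. The helper name `new_socket_none_default` is made up for this illustration; only the `if timeout is not None` check mirrors the snippet quoted from the patch.

```python
import socket


def new_socket_none_default(timeout=None):
    """Hypothetical helper using the 'None means not given' idiom."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if timeout is not None:
        sock.settimeout(timeout)
    return sock


previous = socket.getdefaulttimeout()
socket.setdefaulttimeout(10.0)
try:
    sock = new_socket_none_default(timeout=None)
    # The caller asked for blocking mode (None), but the explicit None is
    # indistinguishable from "not given", so the module default still applies.
    inherited_timeout = sock.gettimeout()
    sock.close()
finally:
    # Restore the global default so the example has no lasting side effect.
    socket.setdefaulttimeout(previous)
```

Here `inherited_timeout` ends up as the module default rather than None, which is the behaviour being objected to.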

Facundo Batista

10:45 a.m.

Alan Kennedy wrote:

...

I see that your updated socket.connect() method takes a timeout parameter, which defaults to None if not present, e.g.

I did NOT update a connect() method. I created a connect() function, in the module socket.py (there's also a connect() method in the socket object, but I didn't touch it).

...

import socket ; socket.setdefaulttimeout(10.0)

how do I restore full blocking behaviour to a single socket? (a somewhat contrived case, I admit).

You cannot, unless you have access to the socket object itself.

...

If I have access to the socket object, then I can call "sock_obj.settimeout(None)", but in that case I don't need the new API. I could also do it with the call "sock_obj.setblocking(1)".

Exactly.

...

If I don't have access to the socket object, i.e. I'm using timeouts indirectly through httplib/etc, then I'm stuck: there's no way I can change the blocking or timeout behaviour; back to square one.

No. This function is meant to make that job easy for higher-level libraries. The code in my patch is currently copied N times in higher-level libraries (httplib, ftplib, smtplib, etc). In all those libraries, the socket is opened and used, and its state is never changed between non-blocking, timeout, and nothing.

Experience (personal, and complaints on mailing lists) shows that a timeout is needed: a lot of the time people want to do urllib2.urlopen(....., timeout=10), for example. But I have never heard of anybody wanting to "go to timeout" and then "back to blocking mode" with the same socket, using high-level libraries.

...

So the new proposed API partly addresses the problem of increasing control over the underlying socket, but doesn't address all cases. It specifically prevents setting a timeout value of None on a socket, which is an important use case, I think.

False. If you want to set a timeout value of None on a socket, you surely can; I haven't touched a single line of code in socket-the-object!

Regards,

-- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

Alan Kennedy

11:53 a.m.

[Alan Kennedy]

...

...
I see that your updated socket.connect() method takes a timeout parameter, which defaults to None if not present, e.g.

[Facundo Batista]

...

I did NOT update a connect() method. I created a connect() function, in the module socket.py (there's also a connect() method in the socket object, but I didn't touch it).

Sorry, my mistake. I realise now that you're creating a whole new function, dedicated to the special (but extremely common) case of creating a fully connected client socket. My fault for not realising that first off.

So, a question I would ask is: Is "connect" the right name for that function?

- Will users get confused between the "connect" function and socket.connect method? They are doing different things.
- Will the naming give rise to the question "the socket-module-level function connect() takes a timeout parameter, why doesn't the socket-method connect() take a timeout parameter as well?"

Perhaps a better name might be "create_connected_client_socket", or something equally descriptive?

Another question I would ask is: "How do I ensure that my newly created connected client socket is in blocking mode, *without* making any assumptions about the value of socket.getdefaulttimeout()?"

If the answer to this question is "you can't", then I would suggest a function signature and implementation like this instead

    def connect(address, **kwargs):
        [snip]
        if 'timeout' in kwargs:
            sock.settimeout(kwargs['timeout'])
        [snip]

This would of course mean that the user would have to explicitly name the 'timeout' parameter, but that's a good thing in this case, IMO.

Regards,

Alan.

Facundo Batista

12:55 p.m.

Alan Kennedy wrote:

...

Sorry, my mistake.

No problem.

...

So, a question I would ask is: Is "connect" the right name for that function? ... Perhaps a better name might be "create_connected_client_socket", or something equally descriptive?

Guido proposed "connect_with_timeout". I like neither your proposal nor Guido's. But I recognize that maybe "connect" is not the best name. What about "create_connection"?

...

Another question I would ask is: "How do I ensure that my newly created connected client socket is in blocking mode, *without* making any assumptions about the value of socket.getdefaulttimeout()?"

Call it like this:

    newsock = socket.connect((..., ...))
    newsock.setblocking(1)

Remember that this function is meant to replace the same code in N other places, and in none of those places did I see this usage.

Regards,

-- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

Alan Kennedy

1:17 p.m.

[Facundo]

...

But, I recognize that maybe it's [connect] not the best name. What about "create_connection"?

I have no strong feelings about it, other than to say it should not be "connect". How about

* connect_to_server()
* open_connection()
* open_client_connection()

There's no need to include "timeout" in the name, IMO.

[Alan]

...

...
Another question I would ask is: "How do I ensure that my newly created connected client socket is in blocking mode, *without* making any assumptions about the value of socket.getdefaulttimeout()?"

[Facundo]

...

Call like this:

    newsock = socket.connect((..., ...))
    newsock.setblocking(1)

Ah, but it's too late by the time the socket.connect call returns: the timeout/blocking behaviour of the socket.connect call is the very thing we're trying to control.

Whenever I look at the proposed API, I think: What happens when the socket.connect call is preceded by a call which changes the default socket timeout/blocking behaviour, e.g.

    socket.setdefaulttimeout(1)
    newsock = socket.connect(HOST, PORT, None)  # <-- None param ignored
    newsock.setblocking(1)  # <-- This does not affect the behaviour of the connect

I.E. I do not get the blocking behaviour I want. The proposed API does not permit me to get blocking behaviour by specifying a timeout value of None.

Whereas with the slightly modified API I suggested earlier, it simply becomes

    socket.setdefaulttimeout(1)
    newsock = socket.connect(HOST, PORT, timeout=None)
    # newsock.setblocking(1)  # <-- No longer relevant

Regards,

Alan.

Facundo Batista

3:24 p.m.

Alan Kennedy wrote:

...

[Facundo]

...
But, I recognize that maybe it's [connect] not the best name. What about "create_connection"?

I have no strong feelings about it, other than to say it should not be "connect". How about

Ok. "create_connection", then.

...

Ah, but it's too late by the time the socket.connect call returns: the timeout/blocking behaviour of the socket.connect call is the very thing we're trying to control.

It's not the very thing, just one of them... whatever, you have a point.

...

Whereas with the slightly modified API I suggested earlier, it simply becomes

I'm OK with that API, except that you're losing positional parameters. Is it OK to *always* need to write the "timeout="?

The problem here is that I used None to check whether you passed a parameter or not, an idiom well established in Python, but in this very case None has a meaning of its own.

I'm +0 on requiring a named parameter here.

So, I have two modifications to make to the patch:

- change the name to "create_connection"
- make timeout a mandatorily named parameter

Is everybody ok with this?

Regards,

-- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

Steven Bethard

4:17 p.m.

On 3/20/07, Facundo Batista wrote:

...

So, I have two modifications to make to the patch:

- change the name to "create_connection"
- make timeout obligatory named

Is everybody ok with this?

FWLIW, +1. It was not immediately apparent to me that the third argument in::

    newsock = socket.create_connection(HOST, PORT, None)

is supposed to be a timeout. The modified version::

    newsock = socket.create_connection(HOST, PORT, timeout=None)

is much clearer to me.

STeVe

-- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

Facundo Batista

4:37 p.m.

Steven Bethard wrote:

...

is supposed to be a timeout. The modified version::

newsock = socket.create_connection(HOST, PORT, timeout=None)

Warning. The correct function signature is

    create_connection(address[, timeout=None])

where address is a tuple (HOST, PORT).

BTW, how can I indicate in the tex file (docs) that the parameter, if present, is mandatorily named?

Thanks!

-- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

Josiah Carlson

4:40 p.m.

"Steven Bethard" wrote:

...

On 3/20/07, Facundo Batista wrote:

...
So, I have two modifications to make to the patch:

- change the name to "create_connection"
- make timeout obligatory named

Is everybody ok with this?

FWLIW, +1. It was not immediately apparent to me that the third argument in::

newsock = socket.create_connection(HOST, PORT, None)

is supposed to be a timeout. The modified version::

newsock = socket.create_connection(HOST, PORT, timeout=None)

is much clearer to me.

Agreed, though implementation-wise, there is another technique for determining whether the user provided an argument to timeout or not...

    sentinel = object()

    def connect(HOST, PORT, timeout=sentinel):
        ...
        if timeout is not sentinel:
            sock.settimeout(timeout)
        ...

A keyword argument via **kwargs is also fine. I have no preference.

- Josiah
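A runnable version of the sentinel technique, again without touching the network, might look like the following. The names `_NOT_GIVEN` and `new_socket_sentinel` are hypothetical; the point is only that an explicit None is now honoured while an omitted argument falls back to the module default.

```python
import socket

_NOT_GIVEN = object()  # hypothetical sentinel; unique, so no caller value can equal it


def new_socket_sentinel(timeout=_NOT_GIVEN):
    """Hypothetical helper: apply timeout only if the caller supplied one."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if timeout is not _NOT_GIVEN:
        sock.settimeout(timeout)
    return sock


previous = socket.getdefaulttimeout()
socket.setdefaulttimeout(10.0)
try:
    observed = []
    # Omitted, explicit None (blocking), and explicit numeric timeout.
    for sock in (new_socket_sentinel(),
                 new_socket_sentinel(None),
                 new_socket_sentinel(2.5)):
        observed.append(sock.gettimeout())
        sock.close()
finally:
    socket.setdefaulttimeout(previous)
```

The three sockets report the module default, None, and 2.5 respectively, so the blocking case is no longer lost, and the timeout can still be passed positionally.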

Facundo Batista

4:41 p.m.

Josiah Carlson wrote:

...

sentinel = object()

    def connect(HOST, PORT, timeout=sentinel):
        ...
        if timeout is not sentinel:
            sock.settimeout(timeout)
        ...

A keyword argument via **kwargs is also fine. I have no preference.

I do. With the way you showed here, I'm not restricting the user's options. I think this is better.

Regards,

-- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

Josiah Carlson

5:16 p.m.

Facundo Batista wrote:

...

Josiah Carlson wrote:

...
sentinel = object()

    def connect(HOST, PORT, timeout=sentinel):
        ...
        if timeout is not sentinel:
            sock.settimeout(timeout)
        ...

A keyword argument via **kwargs is also fine. I have no preference.

I do. The way you showed here, I'm not restricting user options. I think this is better.

But the kwargs doesn't restrict options either...

    def connect(address, **kwargs):
        ...
        if 'timeout' in kwargs:
            sock.settimeout(kwargs['timeout'])
        ...

With that method you can include timeout=None, and it also doesn't restrict what the user could pass as a value to timeout. It requires that you pass timeout explicitly, but that's a (relatively inconsequential) API decision.

- Josiah

Facundo Batista

5:20 p.m.

Josiah Carlson wrote:

...

restrict what the user could pass as a value to timeout. It requires that you pass timeout explicitly, but that's a (relatively inconsequential) API decision.

This is exactly the point. It's an API decision that you must communicate to the user, and he/she must read it and remember it.

Letting "timeout" be positional or named is just less error-prone. So, if I can make it this way, it's what I prefer, :)

Regards,

-- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

Alan Kennedy

6:25 p.m.

[Facundo]

...

Letting "timeout" be positional or named, it's just less error prone. So, if I can make it this way, it's what I prefer, :)

So, if I want a timeout of, say, 80 seconds, I issue a call like this

    new_socket = socket.create_connection(address, 80)

So is that address = host, port = 80?

Or is it address = (host, port), timeout=80?

I know *we* know what it is, but will the user?

I prefer explicit naming of the timeout parameter.

Regards,

Alan.

Josiah Carlson

7:10 p.m.

"Alan Kennedy" wrote:

...

[Facundo]

...
Letting "timeout" be positional or named, it's just less error prone. So, if I can make it this way, it's what I prefer, :)

So, if I want a timeout of, say, 80 seconds, I issue a call like this

new_socket = socket.create_connection(address, 80)

So is that address = host, port = 80?

Or is it address = (host, port), timeout=80?

I know *we* know what it is, but will the user?

I prefer explicit naming of the timeout parameter.

Error-wise, I agree that it would be better to pass timeout explicitly with a keyword, but generally users will notice their mistake if they try to do create_connection(host, port) by

    ValueError("tuple expected as first argument, got str instead")

Is it better than

    TypeError("create_connection takes 1 argument (2 given)")

?

- Josiah

Alan Kennedy

8:27 p.m.

[Josiah]

...

Error-wise, I agree that it would be better to pass timeout explicitly with a keyword, but generally users will notice their mistake if they try to do create_connection(host, port) by ValueError("tuple expected as first argument, got str instead") Is it better than TypeError("create_connection takes 1 argument (2 given)") ?

Yes, it is better.

Currently, the socket.connect() method takes a tuple, and fails with the following exception if 2 separate parameters are passed

    TypeError: connect() takes exactly one argument (2 given)

Which is fine because the function does take exactly one argument. But we're discussing a function with an optional timeout parameter, so that TypeError wouldn't be raised if I called create_connection("localhost", 80).

The patch as it currently is, if I am reading it right, would raise one of the following if a string was passed as the address argument, depending on the length of the string.

    ValueError: need more than 1 value to unpack  # len(address) == 1
    ValueError: too many values to unpack  # len(address) > 2

since it extracts the host and port like so

    host, port = address

Which succeeds, somewhat surprisingly, if a string is passed that is 2 characters long. I was a little surprised to find that this didn't give rise to an error: host, port = "ab". So with a two character hostname, the second letter would be unpacked as a port number. And the function would then fail with the following exception when it reaches the getaddrinfo("a", "b", 0, SOCK_STREAM) call.

    socket.gaierror: (10109, 'getaddrinfo failed')

I suggest updating the patch to

- Explicitly check that the address passed is a tuple of (string, integer)
- To raise an exception explaining the parameter expectation when it is not met
- To require that the user explicitly name the timeout parameter

Regards,

Alan.
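The two-character-string surprise and the suggested explicit check can both be shown in a few lines. The validator name `split_address` is hypothetical, and accepting a list as well as a tuple is an assumption made here, not something the thread settled.

```python
# Unpacking a two-character string "works": each character becomes one value.
host, port = "ab"


def split_address(address):
    """Hypothetical validator: return (host, port) or raise a descriptive error."""
    if not isinstance(address, (tuple, list)):
        raise TypeError(
            "address must be a (host, port) pair, got %r" % (address,))
    if len(address) != 2:
        raise ValueError(
            "address must have exactly 2 items, got %d" % len(address))
    checked_host, checked_port = address
    if not isinstance(checked_host, str) or not isinstance(checked_port, int):
        raise TypeError(
            "address must be (str, int), got %r" % (address,))
    return checked_host, checked_port


try:
    split_address("ab")
except TypeError as exc:
    string_error = exc
else:
    string_error = None
```

With the check in place a bare string is rejected immediately with a message about the expected shape, instead of failing later inside getaddrinfo.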

Facundo Batista

11:27 p.m.

Alan Kennedy wrote:

...

- Explicitly check that the address passed is a tuple of (string, integer)

It's more probable that a user passes a list of two values than a host of two letters, as you suggested above...

...

- To raise an exception explaining the parameter expectation when it is not met

Won't be necessary if we take into account the explicit timeout parameter...

...

- To require that the user explicitly name the timeout parameter

I already agreed on this, :)

So, as a summary:

- I'll make "timeout" mandatorily named
- The function signature will be: create_connection(address[, timeout])

Note that timeout doesn't have a default value: if you include it, it'll set the socket timeout; if you don't, the default timeout will apply. The address is a tuple (host, port), as usual.

In the code, I'll just do "host, port = address"; I don't think it will be a problem at all. Remember that this function's primary use is in higher-level libraries, and that an "address" in the socket environment is always, always, (host, port).

Regards,

-- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

Alan Kennedy

22 Mar

9:22 a.m.

[Alan]

...

...
- Explicitly check that the address passed is a tuple of (string, integer)

[Facundo]

...

In the code, I'll just make "host, port = address", I don't think it will be a problem at all. Remember that this function primary use is for higher level libraries, and that "address" in socket enviroment is always, always, (host, port).

It's rather unfortunate that the tuple needs to be unpacked at all.

Instead, it should be possible to simply pass the address tuple directly to the socket.getaddrinfo() function, and let it worry about the tuple-ness of the address, raising exceptions accordingly.

The socket.getaddrinfo() function, unlike every other python socket function, takes separate host and port parameters. Which forces every user of the socket.getaddrinfo function to do the same unnecessary and potentially error-prone address tuple unpacking.

I have raised a feature request to change this.

[1685962] socket.getaddrinfo() should take an address tuple.

Regards,

Alan.
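The unpacking step being discussed is small enough to wrap. The helper below is a hypothetical convenience, not part of the socket module or of the feature request; it uses a numeric address so that no DNS lookup is needed.

```python
import socket


def getaddrinfo_for(address, family=0, socktype=socket.SOCK_STREAM):
    """Hypothetical wrapper: accept an (host, port) pair and unpack it once."""
    host, port = address
    # socket.getaddrinfo itself takes host and port as separate arguments.
    return socket.getaddrinfo(host, port, family, socktype)


infos = getaddrinfo_for(("127.0.0.1", 80))
# Each entry is (family, socktype, proto, canonname, sockaddr).
first_sockaddr = infos[0][4]
```

The sockaddr in each returned entry is already in the tuple form that the socket object's connect() method expects.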

Guido van Rossum

10:41 a.m.

On 3/22/07, Alan Kennedy wrote:

...

[Alan]

...
...
- Explicitly check that the address passed is a tuple of (string, integer)

[Facundo]

...
In the code, I'll just make "host, port = address", I don't think it will be a problem at all. Remember that this function primary use is for higher level libraries, and that "address" in socket enviroment is always, always, (host, port).

It's rather unfortunate that the tuple needs to be unpacked at all.

Why?

...

Instead, it should be possible to simply pass the address tuple directly to the socket.getaddrinfo() function, and let it worry about the tuple-ness of the address, raising exceptions accordingly.

The socket.getaddrinfo() function, unlike every other python socket function, takes separate host and port parameters. Which forces every user of the socket.getaddrinfo function to do the same unnecessary and potentially error-prone address tuple unpacking.

I have raised a feature request to change this.

[1685962] socket.getaddrinfo() should take an address tuple.

It's unlikely to be granted. Getaddrinfo(), like gethostname() and a few other things, lives at a different abstraction level than the basic socket object; it is only relevant for IP sockets, not for other types of addresses. The Python call just wraps the system call which has a similar API. While from a purist POV you might want to move all IP-related APIs out of the "pure" socket module (and this would include SSL), in practice, nobody cares. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

Greg Ewing

9:07 p.m.

Guido van Rossum wrote:

...

It's unlikely to be granted. ... The Python call just wraps the system call which has a similar API.

What about letting it accept both? Maintaining strict consistency with the C API here at the cost of causing pain on almost all uses of the function seems to be a case of purity beating practicality. -- Greg

Facundo Batista

20 Mar

11:20 p.m.

Alan Kennedy wrote:

...

So is that address = host, port = 80?

Or is it address = (host, port), timeout=80?

The latter, as in the rest of Python...

I see your point: you say it's less error-prone to make timeout mandatory. I really don't care, so I'll take your advice...

-- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

Alan Kennedy

5:43 p.m.

[Facundo]

...

So, I have two modifications to make to the patch:

- change the name to "create_connection"
- make timeout obligatory named

I was going to suggest a third change: for orthogonality with the API for socket objects, add a blocking parameter as well, i.e.

    def create_connection(address, timeout=sentinel, blocking=sentinel):
        [snip]
        if timeout is not sentinel:
            new_socket.settimeout(timeout)
        if blocking is not sentinel:
            new_socket.setblocking(blocking)
        [snip]

but that may be taking it too far.

But there is still an issue remaining, relating to non-blocking IO. With or without a blocking parameter, the user can still set non-blocking behaviour on a socket by setting a timeout of 0. The following snippet illustrates the issue.

    #-=-=-=-=-=-=-=-=-=-=-=-=-=
    import socket
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(0)
    s.connect( ("localhost", 80) )
    #-=-=-=-=-=-=-=-=-=-=-=-=-=

If you run this, it is very likely to generate an exception, but not guaranteed: you may have to run it a few times. Or try a host that is slower to respond.

The problem is that the connect call is now a non-blocking connect, which means that it may throw a socket.error, even after a successful connect, as follows

    socket.error: (10035, 'The socket operation could not complete without blocking')

The standard mechanism in C for doing a non-blocking connect is to issue the connect call, and check the return value for a non-zero error code. If this error code is errno.EAGAIN (code 10035), then the call succeeded, but you should check back later for completion of the operation.

It was for this reason that the connect_ex method was introduced to python socket objects. Instead of raising an exception, it directly returns the error code from the socket operation, so that it can be checked, as in C.

So in the case of the new create_connection function, either

A: The user should be prepared to handle an exception if they use a zero timeout (i.e. set non-blocking mode)

or

B: Detect the non-blocking case inside the function implementation and return the value of the connect_ex method instead of the connect method, as would be standard in a non-blocking app.

This could be implemented as follows

    def create_connection(address, timeout=sentinel):
        [snip]
        if timeout is not sentinel:
            new_socket.settimeout(timeout)
        if new_socket.gettimeout() == 0:
            result = new_socket.connect_ex(address)
        else:
            new_socket.connect(address)
            result = new_socket
        [snip]

I know that this makes it all more complex, and I'm *not* saying the new function should be modified to include these concerns.

The new function is designed to address straightforward usability cases, so it's perhaps appropriate that the API be restricted to those cases, i.e. to supporting timeout values > 0. Perhaps this limit could be coded into the function?

Also, people who want to explicitly do non-blocking socket IO will likely use the socket API directly, so it may not be worth supporting that use in this function.

Regards,

Alan.
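The connect_ex pattern described in this message can be tried against a throwaway local listener, so it does not depend on an external host. This is only a sketch of the general technique. One assumption worth flagging: the exact "still in progress" code is platform dependent (10035 is the Windows WSAEWOULDBLOCK value, while POSIX systems typically report EINPROGRESS for a non-blocking connect), so the example accepts several codes.

```python
import errno
import select
import socket

# Throwaway listener on an ephemeral local port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.settimeout(0)  # a timeout of 0 puts the socket in non-blocking mode

# connect_ex returns an error code instead of raising an exception.
code = client.connect_ex(listener.getsockname())

# 0 means "connected already"; these codes mean "check back later".
in_progress = (errno.EINPROGRESS, errno.EWOULDBLOCK, errno.EAGAIN)
if code != 0 and code not in in_progress:
    raise OSError(code, "connect failed immediately")

# The socket becomes writable once the connection attempt has finished;
# SO_ERROR then holds the final outcome (0 on success).
_, writable, _ = select.select([], [client], [], 5.0)
final_status = client.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)

client.close()
listener.close()
```

This is the extra bookkeeping a caller takes on when passing a zero timeout, which supports the view that such callers are better served by the socket API directly.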

Josiah Carlson

6:11 p.m.

"Alan Kennedy" wrote: [snip]

...

    def create_connection(address, timeout=sentinel):
        [snip]
        if timeout is not sentinel:
            new_socket.settimeout(timeout)
        if new_socket.gettimeout() == 0:
            result = new_socket.connect_ex(address)
        else:
            new_socket.connect(address)
            result = new_socket
        [snip]

I know that this makes it all more complex, and I'm *not* saying the new function should be modified to include these concerns. [snip]

But now the result could be either an error code OR a socket. One of the reasons to provide a timeout for the create_connection call, if I understand correctly, is to handle cases for which you don't get a response back in sufficient time. If the user provides zero as a timeout, then they may very well get an exception, which is what they should expect. Then again, even with an arbitrary timeout, an exception is possible (especially if a host is down, etc.), and hiding the exceptional condition (didn't connect in the allotted time) is a bad thing. - Josiah

Alan Kennedy

6:21 p.m.

[Josiah]

...

But now the result could be either an error code OR a socket. One of the reasons to provide a timeout for the create_connection call, if I understand correctly, is to handle cases for which you don't get a response back in sufficient time.

AFAICT, that's the only reason. It's not to handle blocking sockets, that's the default operation of sockets. And it's not to handle non-blocking sockets either.

...

If the user provides zero as a timeout, then they may very well get an exception, which is what they should expect.

Yes, they should expect it. And they would handle it like this

    import errno
    try:
        new_socket = socket.create_connection(address, 0)
    except socket.error as e:
        if e.errno == 10035:  # or relevant platform specific symbolic constant
            pass  # socket is still connecting
        else:
            raise  # there was a real socket error

...

Then again, even with an arbitrary timeout, an exception is possible (especially if a host is down, etc.), and hiding the exceptional condition (didn't connect in the allotted time) is a bad thing.

See above. Regards, Alan.

Greg Ewing

6:51 p.m.

Alan Kennedy wrote:

...

The standard mechanism in C for doing a non-blocking connect is to issue the connect call, and check the return value for a non-zero error code. If this error code is errno.EAGAIN (code 10035), then the call succeeded, but you should check back later for completion of the operation.

Hmmm. I think that this case probably isn't what people will have in mind when they specify a timeout for connecting. More likely they mean "If the connection couldn't be successfully established within this time, give up and let me know." So it seems to me that a return value of EAGAIN should be handled internally by re-issuing the connect call with a suitably reduced timeout value. If the timeout gets down to zero without a successful result, throw an exception. An application that wants to do fully asynchronous connects will have to take quite a different approach, so there should probably be a different API for this. -- Greg

Josiah Carlson

7:20 p.m.

Greg Ewing wrote:

...

Alan Kennedy wrote:

...
The standard mechanism in C for doing a non-blocking connect is to issue the connect call, and check the return value for a non-zero error code. If this error code is errno.EAGAIN (code 10035), then the call succeeded, but you should check back later for completion of the operation.

An application that wants to do fully asynchronous connects will have to take quite a different approach, so there should probably be a different API for this.

*cough* asyncore or twisted *cough* Sorry, what were we talking about again? Oh yeah, timeouts. From what I understand of connect and connect_ex, if a socket has a specified timeout, it is supposed to "try" (it only attempts once, and waits for a response) until it either fails (because the other end won't accept), or it times out. Either case is perfectly fine, and I don't really see the point of retrying (in socket.create_connection). - Josiah

Greg Ewing

6:18 p.m.

Alan Kennedy wrote:

...

    def connect(address, **kwargs):
        [snip]
        if 'timeout' in kwargs:
            sock.settimeout(kwargs['timeout'])
        [snip]

A problem with interfaces like this is that it makes it awkward to pass on a value that you received from higher up. An alternative would be to create and publish a different special value to mean "no timeout". -- Greg
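The forwarding problem raised here shows up as soon as a higher-level function wants to accept a timeout and hand it down. The sketch below contrasts the two styles; every name in it (`NO_TIMEOUT_GIVEN`, `low_level_kwargs`, `low_level_sentinel`, and the two wrappers) is hypothetical, and the low-level functions just report what they would do instead of opening a socket.

```python
NO_TIMEOUT_GIVEN = object()  # hypothetical published "not specified" marker


def low_level_kwargs(address, **kwargs):
    """Presence-in-kwargs style: omitting the key means 'use the default'."""
    return kwargs["timeout"] if "timeout" in kwargs else "module default"


def low_level_sentinel(address, timeout=NO_TIMEOUT_GIVEN):
    """Published-marker style: the marker value means 'use the default'."""
    return "module default" if timeout is NO_TIMEOUT_GIVEN else timeout


def wrapper_over_kwargs(address, timeout=NO_TIMEOUT_GIVEN):
    # The wrapper has to branch, because "omit the argument" cannot be
    # expressed as a value to pass along.
    if timeout is NO_TIMEOUT_GIVEN:
        return low_level_kwargs(address)
    return low_level_kwargs(address, timeout=timeout)


def wrapper_over_sentinel(address, timeout=NO_TIMEOUT_GIVEN):
    # With a published marker the value is simply passed through.
    return low_level_sentinel(address, timeout)
```

Both wrappers give the same answers; the difference is only in how much ceremony each intermediate layer needs.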
