Mailman 3 Socket timeout patch - Python-Dev

Socket timeout patch

Michael Gilfix

May 12, 2002

7:27 a.m.

The socket timeout patch is finally available with doc updates n' unit test as patch #555085 in the SF tracker. This implementation provides timeout functionality at the C level. A patch to socket.py also fixes the problem of losing data while an exception is thrown to the underlying socket. -- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html

Show replies by date

Guido van Rossum

May 2002

3:13 p.m.

...

Hi Michael, Sorry for taking such a long time to review this. (Apparently nobody else had time either.) And thanks for doing this! I think this is a great start. Rather than typing them in the SF bug manager issue I am mailing them (I'll add a link to the archived version of this message to the SF bug manager to track the activity). General style nits: - You submitted a reverse diff! Customary is diff(old, new). - Please don't put a space between the function name and the open parenthesis. (You do this both in C and in Python code.) - Also please don't put a space between the open parenthesis and the first argument (you do this almost consistently in test_timeout.py). - Please don't introduce lines longer than 78 columns. Feedback on the patch to socket.py: - I think you're changing the semantics of unbuffered and line-buffered reads/writes on Windows. For one thing, you no longer implement line-buffered writes correctly. The idea is that if the buffer size is set to 1, data is flushed at \n only, so that if the code builds up the line using many small writes, this doesn't result in many small sends. There was code for this in write() -- why did you delete it? - It also looks like you've broken the semantics of size<0 in read(). - Maybe changing the write buffer to a list makes sense too? - Since this code appears independent from the timeoutsocket code, maybe we can discuss this separately? Feedback on the documentation: - I would document that the argument to settimeout() should be int or float (hm, can it be a long? that should be acceptable even if it's strange), and that the return value of gettimeout() is None or a float. - You may want to document the interaction with blocking mode. Feedback on the C socket module changes: - Why import the select module? It's much more efficient to use the select system call directly. You don't need all the overhead and generality that the select module adds on top of it, and it costs a lot (select allocates lots of objects and lots of memory and hence is very expensive). - <errno.h> is already included by Python.h. - Please don't introduce more static functions with a 'Py' name prefix. - You should raise TypeError if the type of the argument is wrong, and ValueError if the value is wrong (out of range). Not SocketError. - I believe that you can't reliably maintain a "sock_blocking" flag; there are setsockopt() or ioctl() calls that can make a socket blocking or non-blocking. Also, there's no way to know whether a socket passed to fromfd() is in blocking mode or not. - There are refcount bugs. I didn't do a detailed review of these, but I note that the return value from PyFloat_FromDouble() in PySocketSock_settimeout() is leaked. (There's an INCREF that's only needed for the None case.) - The function internal_select() is *always* used like this: count = internal_select (s, 1); if (count < 0) return NULL; else if (count == 0) /* Timeout elapsed */ return PyTimeout_Err (); If internal_select() called PyTimeout_Err() itself, all call sites could be simplified to this: count = internal_select (s, 1); if (count < 0) return NULL; or even (now that the count variable is no longer needed) to this: if(internal_select (s, 1) < 0) return NULL; - The accept() wrapper contains this bit of code (only showing the Unix variant): if (newfd < 0) if (!s->sock_blocking || (errno != EAGAIN && errno != EWOULDBLOCK)) return s->errorhandler (); Isn't the sense of testing s->sock_blocking wrong? I would think that if we're in blocking mode we'd want to return immediately without even checking the errno. I recommend writing this out more clearly, e.g. like this: if (s->sock_blocking) return s->errorhandler(); /* A non-blocking accept() failed */ if (errno != EAGAIN && errno != EWOULDBLOCK) return s->errorhandler(); - What is s->errorhandler() for? AFAICT, this is always equal to PySocket_Err! - The whole interaction between non-blocking mode and timeout mode is confusing to me. Are you sure that this always does the right thing? Have you even thought about what "the right thing" is in all 4 combinations? --Guido van Rossum (home page: http://www.python.org/~guido/)

Michael Gilfix

2:50 p.m.

Whew. Lots of stuff. Give me a while to address all these issues, rework the patch, and then submit a comprehensive answer. Should have an answer by the end of the week latest. -- Mike On Thu, May 23 @ 16:13, Guido van Rossum wrote:

...

Sorry for taking such a long time to review this. (Apparently nobody else had time either.) And thanks for doing this! I think this is a great start. Rather than typing them in the SF bug manager issue I am mailing them (I'll add a link to the archived version of this message to the SF bug manager to track the activity).

General style nits:

- You submitted a reverse diff! Customary is diff(old, new).

- Please don't put a space between the function name and the open parenthesis. (You do this both in C and in Python code.)

- Also please don't put a space between the open parenthesis and the first argument (you do this almost consistently in test_timeout.py).

- Please don't introduce lines longer than 78 columns.

Feedback on the patch to socket.py:

- I think you're changing the semantics of unbuffered and line-buffered reads/writes on Windows. For one thing, you no longer implement line-buffered writes correctly. The idea is that if the buffer size is set to 1, data is flushed at \n only, so that if the code builds up the line using many small writes, this doesn't result in many small sends. There was code for this in write() -- why did you delete it?

- It also looks like you've broken the semantics of size<0 in read().

- Maybe changing the write buffer to a list makes sense too?

- Since this code appears independent from the timeoutsocket code, maybe we can discuss this separately?

Feedback on the documentation:

- I would document that the argument to settimeout() should be int or float (hm, can it be a long? that should be acceptable even if it's strange), and that the return value of gettimeout() is None or a float.

- You may want to document the interaction with blocking mode.

Feedback on the C socket module changes:

- Why import the select module? It's much more efficient to use the select system call directly. You don't need all the overhead and generality that the select module adds on top of it, and it costs a lot (select allocates lots of objects and lots of memory and hence is very expensive).

- <errno.h> is already included by Python.h.

- Please don't introduce more static functions with a 'Py' name prefix.

- You should raise TypeError if the type of the argument is wrong, and ValueError if the value is wrong (out of range). Not SocketError.

- I believe that you can't reliably maintain a "sock_blocking" flag; there are setsockopt() or ioctl() calls that can make a socket blocking or non-blocking. Also, there's no way to know whether a socket passed to fromfd() is in blocking mode or not.

- There are refcount bugs. I didn't do a detailed review of these, but I note that the return value from PyFloat_FromDouble() in PySocketSock_settimeout() is leaked. (There's an INCREF that's only needed for the None case.)

- The function internal_select() is *always* used like this:

count = internal_select (s, 1); if (count < 0) return NULL; else if (count == 0) /* Timeout elapsed */ return PyTimeout_Err ();

If internal_select() called PyTimeout_Err() itself, all call sites could be simplified to this:

count = internal_select (s, 1); if (count < 0) return NULL;

or even (now that the count variable is no longer needed) to this:

if(internal_select (s, 1) < 0) return NULL;

- The accept() wrapper contains this bit of code (only showing the Unix variant):

if (newfd < 0) if (!s->sock_blocking || (errno != EAGAIN && errno != EWOULDBLOCK)) return s->errorhandler ();

Isn't the sense of testing s->sock_blocking wrong? I would think that if we're in blocking mode we'd want to return immediately without even checking the errno. I recommend writing this out more clearly, e.g. like this:

if (s->sock_blocking) return s->errorhandler(); /* A non-blocking accept() failed */ if (errno != EAGAIN && errno != EWOULDBLOCK) return s->errorhandler();

- What is s->errorhandler() for? AFAICT, this is always equal to PySocket_Err!

- The whole interaction between non-blocking mode and timeout mode is confusing to me. Are you sure that this always does the right thing? Have you even thought about what "the right thing" is in all 4 combinations?

--Guido van Rossum (home page: http://www.python.org/~guido/) `-> (guido)

-- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html

Michael Gilfix

June 2002

10:22 a.m.

Alrighty. Here's the monster reply. I'll be much faster with the replies this week. Had a hectic week last week. On Thu, May 23 @ 16:13, Guido van Rossum wrote:

...

General style nits:

- You submitted a reverse diff! Customary is diff(old, new).

Oops. Will fix that this time round.

...

- Please don't put a space between the function name and the open parenthesis. (You do this both in C and in Python code.)

Fixed. Some personal preference bled it there. All removed in my copy.

...

- Also please don't put a space between the open parenthesis and the first argument (you do this almost consistently in test_timeout.py).

Couldn't really figure out what you were seeing here. I read that you saw something like func( a, b), which I don't see in my local copy. I do have something like this for list comprehension: [ x.find('\n') for x in self._rbuf ] Er, but I though there were supposed to be surrounding spaces at the edges...

...

- Please don't introduce lines longer than 78 columns.

Fixed my offending line. I've also corrected some other lines in the socket module that went over 78 columns (there were a few).

...

Feedback on the patch to socket.py:

- I think you're changing the semantics of unbuffered and line-buffered reads/writes on Windows. For one thing, you no longer implement line-buffered writes correctly. The idea is that if the buffer size is set to 1, data is flushed at \n only, so that if the code builds up the line using many small writes, this doesn't result in many small sends. There was code for this in write() -- why did you delete it?

I screwed up the write. New the write is: def write(self, data): self._wbuf = self._wbuf + data if self._wbufsize == 1: if '\n' in data: self.flush () elif len(self._wbuf) >= self._wbufsize: self.flush() which is pretty much the same as the old. The read should be ok though. I could really use someone with a win compiler to test this for me.

...

- It also looks like you've broken the semantics of size<0 in read().

Maybe I'm misunderstanding the code, but I thought that a size < 0 meant to read until there are no more? The statement: while size < 0 or buf_len < size: accomplishes the same thing as what's in the current socket.py implementation. If you look closely, the 'if >= 0' branch *always* returns, meaning that the < 0 is equiv to while 1. Due to shortcutting, the same thing happens in the above statement. Maybe a comment would make it clearer?

...

- Maybe changing the write buffer to a list makes sense too?

I could do this. Then just do a join before the flush. Is the append /that/ much faster?

...

- Since this code appears independent from the timeoutsocket code, maybe we can discuss this separately?

The point of this code was to keep from losing data when an exception occurs (as timothy, if I remember correctly, pointed out). Hence the reason for keeping a lot more data around in instance variables instead of local variables. So the windows version might (in obscure cases) be affected by the timeout changes. That's what this patch was addressing.

...

Feedback on the documentation:

- I would document that the argument to settimeout() should be int or float (hm, can it be a long? that should be acceptable even if it's strange), and that the return value of gettimeout() is None or a float.

It can be a long in my local copy. The argument can be any numeric value and the special None. I've updated my documentation to be more explicit.

...

- You may want to document the interaction with blocking mode.

I've put notes in the tex documentation. Here's how the interaction works: if the socket is in non-blocking mode: All operations are non-blocking and setting timeouts doesn't mean anything (they are not enforced). A timeout can still be changed and gettimeout will reflect the value but the exception will never be raised. else if the socket is in blocking mode: enabling timeouts does the usual thing you would expect from timeouts.

...

Feedback on the C socket module changes:

- Why import the select module? It's much more efficient to use the select system call directly. You don't need all the overhead and generality that the select module adds on top of it, and it costs a lot (select allocates lots of objects and lots of memory and hence is very expensive).

Well, the thinking was that if there were any portability issues with select, they could be taken care of in one place. At the time, I hadn't really looked closely at the select module. Now that I glance at it, pretty much all the code in the select module just extracts the necessary information from the objects for polling. I suppose I could just use select directly... There's also the advantage of all the error handling in select. I could do a stripped down version of the code, I suppose, for speed. Seemed like a good idea for code re-use.

...

- <errno.h> is already included by Python.h.

I didn't do this but it's been removed.

...

- Please don't introduce more static functions with a 'Py' name prefix.

Only did this in one place, with PyTimeout_Err. The reason was that the other Error functions used the Py prefix, so it was done for consistency. I can change that.. or change the naming scheme with the others if you like.

...

- You should raise TypeError if the type of the argument is wrong, and ValueError if the value is wrong (out of range). Not SocketError.

Oops. Fixed.

...

- I believe that you can't reliably maintain a "sock_blocking" flag; there are setsockopt() or ioctl() calls that can make a socket blocking or non-blocking. Also, there's no way to know whether a socket passed to fromfd() is in blocking mode or not.

Well, upon socket creation (in init_sockobject), the socket is set to blocking mode. I think that handles fromfd, right? Doesn't every initialization means have to call that function? The real problem would be someone using an ioctl or setsockopt (Can you even do blocking stuff through setsockopt?). Ugh. The original timeoutsocket didn't really deal with anything like that. Erm, seems like an interface problem here - using ioctl kinda breaks the socket object interface. Perhaps we should be doing some sort of getsockopt to figure out the blocking mode and update our state accordingly? That would be an extra call for each thing to the interface though. One solution is to set/unset blocking mode right before doing each call to be sure of the state and based on the internally stored value of the blocking attribute... but... then that kind of renders ioctl useless. Another solution might be to set the blocking mode to on everytime someone sets a timeout. That would change the blocking/socket interaction already described a bit but not drastically. Also easy to implement. That sends the message: Don't use ioctls when using timeouts. Hmm.. Will need to think about this more. Any insight would be helpful or some wisdom about how you usually handle this sort of thing.

...

- There are refcount bugs. I didn't do a detailed review of these, but I note that the return value from PyFloat_FromDouble() in PySocketSock_settimeout() is leaked. (There's an INCREF that's only needed for the None case.)

This has been fixed. I was one ref count too high in my scheme.

...

- The function internal_select() is *always* used like this:

count = internal_select (s, 1); if (count < 0) return NULL; else if (count == 0) /* Timeout elapsed */ return PyTimeout_Err ();

If internal_select() called PyTimeout_Err() itself, all call sites could be simplified to this:

count = internal_select (s, 1); if (count < 0) return NULL;

or even (now that the count variable is no longer needed) to this:

if(internal_select (s, 1) < 0) return NULL;

Good point. Except the return value needs to be checked for <= 0 in this case. Changes were made.

...

- The accept() wrapper contains this bit of code (only showing the Unix variant):

if (newfd < 0) if (!s->sock_blocking || (errno != EAGAIN && errno != EWOULDBLOCK)) return s->errorhandler ();

Isn't the sense of testing s->sock_blocking wrong? I would think that if we're in blocking mode we'd want to return immediately without even checking the errno. I recommend writing this out more clearly, e.g. like this:

if (s->sock_blocking) return s->errorhandler(); /* A non-blocking accept() failed */ if (errno != EAGAIN && errno != EWOULDBLOCK) return s->errorhandler();

I've written this out more explicitly as you suggest. It is supposed to be !s->sock_blocking though. If we're in non-blocking mode at that point with an error, then it's definitely an error. If we're in blocking mode, then we have to check the type of error. The reason being that the underlying socket is always in non-blocking mode (remember select) so we need to check that we don't have a weird error. I've written it out like this: if (newfd < 0) { ········if (!s->sock_blocking) ············return s->errorhandler(); ········/* Check if we have a true failure for a blocking socket */ ········if (errno != EAGAIN && errno != EWOULDBLOCK) ············return s->errorhandler(); ····} I've also fixed a similar thing for connect.

...

- What is s->errorhandler() for? AFAICT, this is always equal to PySocket_Err!

This was always in the module. Not sure why it was put there intially. I used it to be consistent.

...

- The whole interaction between non-blocking mode and timeout mode is confusing to me. Are you sure that this always does the right thing? Have you even thought about what "the right thing" is in all 4 combinations?

I think I've explained this earlier in the thread. Lemme know if I need any more clarifications. If you made it this far, it's time for coffee. -- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html

Guido van Rossum

12:22 p.m.

[Addressing only points that need attention]

...

...
- Also please don't put a space between the open parenthesis and the first argument (you do this almost consistently in test_timeout.py).

Couldn't really figure out what you were seeing here. I read that you saw something like func( a, b), which I don't see in my local copy.

test_timeout.py from the SF page has this. I'm glad you fixed this already in your own copy.

...

I do have something like this for list comprehension:

[ x.find('\n') for x in self._rbuf ]

Er, but I though there were supposed to be surrounding spaces at the edges...

I prefer to see that as [x.find('\n') for x in self._rbuf]

...

I screwed up the write. New the write is:

[...]

...

which is pretty much the same as the old. The read should be ok though. I could really use someone with a win compiler to test this for me.

I'll review it more when you next upload it.

...

...
- It also looks like you've broken the semantics of size<0 in read().

Maybe I'm misunderstanding the code, but I thought that a size < 0 meant to read until there are no more? The statement:

while size < 0 or buf_len < size:

accomplishes the same thing as what's in the current socket.py implementation. If you look closely, the 'if >= 0' branch *always* returns, meaning that the < 0 is equiv to while 1. Due to shortcutting, the same thing happens in the above statement. Maybe a comment would make it clearer?

I was referring to this piece of code: ! if buf_len > size: ! self._rbuf.append (data[size:]) ! data = data[:size] Here data[size:] gives you the last byte of the data and data[:size] chops off the last byte.

...

...
- Maybe changing the write buffer to a list makes sense too?

I could do this. Then just do a join before the flush. Is the append /that/ much faster?

Depends on how small the chunks are you write. Roughly, repeated list append is O(N log N), while repeated string append is O(N**2).

...

...
- Since this code appears independent from the timeoutsocket code, maybe we can discuss this separately?

The point of this code was to keep from losing data when an exception occurs (as timothy, if I remember correctly, pointed out). Hence the reason for keeping a lot more data around in instance variables instead of local variables. So the windows version might (in obscure cases) be affected by the timeout changes. That's what this patch was addressing.

OK, but given the issues the first version had, I recommand that the code gets more review and that you write unit tests for all cases.

...

...
- Please don't introduce more static functions with a 'Py' name prefix.

Only did this in one place, with PyTimeout_Err. The reason was that the other Error functions used the Py prefix, so it was done for consistency. I can change that.. or change the naming scheme with the others if you like.

I like to do code cleanup that doesn't change semantics (like renamings) as a separate patch and checkin. You can do this before or after the timeout changes, but don't merge it into the timeout changes. I still like the static names that you introduce not to start with Py.

...

...
- I believe that you can't reliably maintain a "sock_blocking" flag; there are setsockopt() or ioctl() calls that can make a socket blocking or non-blocking. Also, there's no way to know whether a socket passed to fromfd() is in blocking mode or not.

Well, upon socket creation (in init_sockobject), the socket is set to blocking mode. I think that handles fromfd, right? Doesn't every initialization means have to call that function?

OK, it looks like you call internal_setblocking(s, 0) to set the socket in nonblocking mode. (Hm, I don't see any calls to set the socket in blocking mode!) So do I understand that you are now always setting the socket in non-blocking mode, even when there is no timeout specified, and that you look at the sock_blocking flag to decide whether to do timeouts or just pass the nonblocking behavior to the user? This is a change in semantics, and could interfere with existing applications that pass the socket's file descriptor off to other code. I think I'd be happier if the behavior wasn't changed at all until a timeout is set for a socket -- then existing code won't break.

...

The real problem would be someone using an ioctl or setsockopt (Can you even do blocking stuff through setsockopt?).

Yes, setblocking() makes a call to setsockopt(). :-)

...

Ugh. The original timeoutsocket didn't really deal with anything like that. Erm, seems like an interface problem here - using ioctl kinda breaks the socket object interface. Perhaps we should be doing some sort of getsockopt to figure out the blocking mode and update our state accordingly? That would be an extra call for each thing to the interface though.

I only really care for sockets passed in to fromfd(). E.g. someone can currently do: s1 = socket(AF_INET, SOCK_STREAM) s1.setblocking(0) s2 = fromfd(s1.fileno()) # Now s2 is non-blocking too I'd like this to continue to work as long as s1 doesn't set a timeout.

...

One solution is to set/unset blocking mode right before doing each call to be sure of the state and based on the internally stored value of the blocking attribute... but... then that kind of renders ioctl useless.

Don't worry so much about ioctl, but do worry about fromfd.

...

Another solution might be to set the blocking mode to on everytime someone sets a timeout. That would change the blocking/socket interaction already described a bit but not drastically. Also easy to implement. That sends the message: Don't use ioctls when using timeouts.

I like this.

...

Hmm.. Will need to think about this more. Any insight would be helpful or some wisdom about how you usually handle this sort of thing.

See above. Since we don't know what people out there are doing, I don't want to break existing code. We do know that existing code doesn't use (this form of) timeout, so we can exploit that knowledge.

...

...
- What is s->errorhandler() for? AFAICT, this is always equal to PySocket_Err!

This was always in the module. Not sure why it was put there intially. I used it to be consistent.

Argh, you're right. MAL added this; I'll ask him why.

...

If you made it this far, it's time for coffee.

When can I expect a new version? --Guido van Rossum (home page: http://www.python.org/~guido/)

Tim Peters

12:32 p.m.

[Guido]

...

...
...
- Maybe changing the write buffer to a list makes sense too?

[mgilfix@eecs.tufts.edu]

...

...
I could do this. Then just do a join before the flush. Is the append /that/ much faster?

[Guido]

...

Depends on how small the chunks are you write. Roughly, repeated list append is O(N log N), while repeated string append is O(N**2).

Repeated list append is O(N) amortized (a single append may take O(N) time all by itself, but if you do N of them in a row the time is still no worse than O(N) overall; a possible conceptual difficulty may arise here because the value of "N" changes over time, and while growing to a total of size N may require O(log N) whole-list copies, each of the copies involves far fewer elements than the final value of N -- if you add up all these smaller values of N in the worst case, the sum is O(N) wrt the final value of N, and so it's worst-case O(N) overall wrt the final value of N).

Michael Gilfix

1:39 p.m.

On Mon, Jun 03 @ 13:22, Guido van Rossum wrote:

...

...
If you made it this far, it's time for coffee.

When can I expect a new version?

Give me a day or two to address these points and produce the new version. -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html

Michael Gilfix

2:52 p.m.

On Mon, Jun 03 @ 13:22, Guido van Rossum wrote:

...

...
Couldn't really figure out what you were seeing here. I read that you saw something like func( a, b), which I don't see in my local copy.

test_timeout.py from the SF page has this. I'm glad you fixed this already in your own copy.

Weird. I didn't change anything. Oh well. We'll see if it shows up in the new patch this time round.

...

...
I do have something like this for list comprehension:

[ x.find('\n') for x in self._rbuf ]

Er, but I though there were supposed to be surrounding spaces at the edges...

I prefer to see that as

[x.find('\n') for x in self._rbuf]

Ok. Done. One day, you can explain to me why you despise whitespace so. Perhaps she was mean to you or something. She's always hanging around with that tab guy at any rate and they make a bad mix.

...

...
...
- It also looks like you've broken the semantics of size<0 in read().

I was referring to this piece of code:

! if buf_len > size: ! self._rbuf.append (data[size:]) ! data = data[:size]

Here data[size:] gives you the last byte of the data and data[:size] chops off the last byte.

Ok. This has been fixed. All read sizes now work and have been tested by me.

...

...
...
- Maybe changing the write buffer to a list makes sense too?

I could do this. Then just do a join before the flush. Is the append /that/ much faster?

Depends on how small the chunks are you write. Roughly, repeated list append is O(N log N), while repeated string append is O(N**2).

Done. The write buffer now uses a list, so it should be faster than the initial version and the one currently in use.

...

OK, but given the issues the first version had, I recommand that the code gets more review and that you write unit tests for all cases.

I agree. I wasn't through enough in my checking. I'm going to see if I can include a test-case specifically to test the windows file class directly.

...

...
...
- Please don't introduce more static functions with a 'Py' name prefix.

Only did this in one place, with PyTimeout_Err. The reason was that the other Error functions used the Py prefix, so it was done for consistency. I can change that.. or change the naming scheme with the others if you like.

I like to do code cleanup that doesn't change semantics (like renamings) as a separate patch and checkin. You can do this before or after the timeout changes, but don't merge it into the timeout changes. I still like the static names that you introduce not to start with Py.

Ok. I'll change the PyTimeout_Err to just timeout_err. We can do some other cleanup after the patch has been accepted. It's big enough as is and no need to add more complication.

...

OK, it looks like you call internal_setblocking(s, 0) to set the socket in nonblocking mode. (Hm, I don't see any calls to set the socket in blocking mode!)

So do I understand that you are now always setting the socket in non-blocking mode, even when there is no timeout specified, and that you look at the sock_blocking flag to decide whether to do timeouts or just pass the nonblocking behavior to the user?

This is a change in semantics, and could interfere with existing applications that pass the socket's file descriptor off to other code. I think I'd be happier if the behavior wasn't changed at all until a timeout is set for a socket -- then existing code won't break.

So, the best way to proceed seems to be: if (s->sock_timeout == Py_None) /* Perhaps do nothing, or just do original behavior */ else /* Get funky. Do one of the solutions discussed below */

...

I only really care for sockets passed in to fromfd(). E.g. someone can currently do:

s1 = socket(AF_INET, SOCK_STREAM) s1.setblocking(0)

s2 = fromfd(s1.fileno()) # Now s2 is non-blocking too

I'd like this to continue to work as long as s1 doesn't set a timeout.

I see the issue. We'll worry about this and not ioctl. So let's look at solutions:

...

...
One solution is to set/unset blocking mode right before doing each call to be sure of the state and based on the internally stored value of the blocking attribute... but... then that kind of renders ioctl useless.

Don't worry so much about ioctl, but do worry about fromfd.

Not so popular.

...

...
Another solution might be to set the blocking mode to on everytime someone sets a timeout. That would change the blocking/socket interaction already described a bit but not drastically. Also easy to implement. That sends the message: Don't use ioctls when using timeouts.

I like this.

Alright. Well, using the above pseudo-code scheme, we should be alright. So here are the new semantics: If you set_timeout(int/float/long != None): The actual socket gets put in non-blocking mode and the usual select stuff is done. If you set_timeout(None): The old behavior is used AND automatically, the socket is set to blocking mode. That means that someone who was doing non-blocking stuff before, sets a timeout, and then unsets one, will have to do a set_blocking call again if he wants non-blocking stuff. This makes sense 'cause timeout stuff is blocking by nature. That seems fairest and we always have an idea of what state we're in. -- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html

Guido van Rossum

4:33 p.m.

...

Ok. Done. One day, you can explain to me why you despise whitespace so. Perhaps she was mean to you or something. She's always hanging around with that tab guy at any rate and they make a bad mix.

I like the whitespace use in the English language (like so) best.

...

Ok. This has been fixed. All read sizes now work and have been tested by me.

Have you written unit tests? That would be really great. Ideally, the tests should pass both before and after your patches.

...

So, the best way to proceed seems to be:

if (s->sock_timeout == Py_None) /* Perhaps do nothing, or just do original behavior */ else /* Get funky. Do one of the solutions discussed below */

Yes.

...

So here are the new semantics:

If you set_timeout(int/float/long != None): The actual socket gets put in non-blocking mode and the usual select stuff is done. If you set_timeout(None): The old behavior is used AND automatically, the socket is set to blocking mode. That means that someone who was doing non-blocking stuff before, sets a timeout, and then unsets one, will have to do a set_blocking call again if he wants non-blocking stuff. This makes sense 'cause timeout stuff is blocking by nature.

Sounds good! --Guido van Rossum (home page: http://www.python.org/~guido/)

Michael Gilfix

5:07 p.m.

On Wed, Jun 05 @ 17:33, Guido van Rossum wrote:

...

Done. I've added them into the test_socket.py test as I didn't feel like starting a new test that does roughly the same thing. Works on both the old (2.1.3 source I had lying around my system) and the new. -- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html

Michael Gilfix

5:25 p.m.

Ok. The new version of the patch is in the sourceforge tracker. Hopefully I haven't forgotten anything. Enjoy all. -- Mike -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html

Guido van Rossum

May 2002

3:13 p.m.

...

Michael Gilfix

2:50 p.m.

...

Sorry for taking such a long time to review this. (Apparently nobody else had time either.) And thanks for doing this! I think this is a great start. Rather than typing them in the SF bug manager issue I am mailing them (I'll add a link to the archived version of this message to the SF bug manager to track the activity).

General style nits:

- You submitted a reverse diff! Customary is diff(old, new).

- Please don't put a space between the function name and the open parenthesis. (You do this both in C and in Python code.)

- Also please don't put a space between the open parenthesis and the first argument (you do this almost consistently in test_timeout.py).

- Please don't introduce lines longer than 78 columns.

Feedback on the patch to socket.py:

- I think you're changing the semantics of unbuffered and line-buffered reads/writes on Windows. For one thing, you no longer implement line-buffered writes correctly. The idea is that if the buffer size is set to 1, data is flushed at \n only, so that if the code builds up the line using many small writes, this doesn't result in many small sends. There was code for this in write() -- why did you delete it?

- It also looks like you've broken the semantics of size<0 in read().

- Maybe changing the write buffer to a list makes sense too?

- Since this code appears independent from the timeoutsocket code, maybe we can discuss this separately?

Feedback on the documentation:

- I would document that the argument to settimeout() should be int or float (hm, can it be a long? that should be acceptable even if it's strange), and that the return value of gettimeout() is None or a float.

- You may want to document the interaction with blocking mode.

Feedback on the C socket module changes:

- Why import the select module? It's much more efficient to use the select system call directly. You don't need all the overhead and generality that the select module adds on top of it, and it costs a lot (select allocates lots of objects and lots of memory and hence is very expensive).

- <errno.h> is already included by Python.h.

- Please don't introduce more static functions with a 'Py' name prefix.

- You should raise TypeError if the type of the argument is wrong, and ValueError if the value is wrong (out of range). Not SocketError.

- I believe that you can't reliably maintain a "sock_blocking" flag; there are setsockopt() or ioctl() calls that can make a socket blocking or non-blocking. Also, there's no way to know whether a socket passed to fromfd() is in blocking mode or not.

- There are refcount bugs. I didn't do a detailed review of these, but I note that the return value from PyFloat_FromDouble() in PySocketSock_settimeout() is leaked. (There's an INCREF that's only needed for the None case.)

- The function internal_select() is *always* used like this:

count = internal_select (s, 1); if (count < 0) return NULL; else if (count == 0) /* Timeout elapsed */ return PyTimeout_Err ();

If internal_select() called PyTimeout_Err() itself, all call sites could be simplified to this:

count = internal_select (s, 1); if (count < 0) return NULL;

or even (now that the count variable is no longer needed) to this:

if(internal_select (s, 1) < 0) return NULL;

- The accept() wrapper contains this bit of code (only showing the Unix variant):

if (newfd < 0) if (!s->sock_blocking || (errno != EAGAIN && errno != EWOULDBLOCK)) return s->errorhandler ();

Isn't the sense of testing s->sock_blocking wrong? I would think that if we're in blocking mode we'd want to return immediately without even checking the errno. I recommend writing this out more clearly, e.g. like this:

if (s->sock_blocking) return s->errorhandler(); /* A non-blocking accept() failed */ if (errno != EAGAIN && errno != EWOULDBLOCK) return s->errorhandler();

- What is s->errorhandler() for? AFAICT, this is always equal to PySocket_Err!

- The whole interaction between non-blocking mode and timeout mode is confusing to me. Are you sure that this always does the right thing? Have you even thought about what "the right thing" is in all 4 combinations?

--Guido van Rossum (home page: http://www.python.org/~guido/) `-> (guido)

-- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html

Michael Gilfix

June 2002

10:22 a.m.

Alrighty. Here's the monster reply. I'll be much faster with the replies this week. Had a hectic week last week. On Thu, May 23 @ 16:13, Guido van Rossum wrote:

...

General style nits:

- You submitted a reverse diff! Customary is diff(old, new).

Oops. Will fix that this time round.

...

- Please don't put a space between the function name and the open parenthesis. (You do this both in C and in Python code.)

Fixed. Some personal preference bled it there. All removed in my copy.

...

- Also please don't put a space between the open parenthesis and the first argument (you do this almost consistently in test_timeout.py).

...

- Please don't introduce lines longer than 78 columns.

Fixed my offending line. I've also corrected some other lines in the socket module that went over 78 columns (there were a few).

...

Feedback on the patch to socket.py:

- I think you're changing the semantics of unbuffered and line-buffered reads/writes on Windows. For one thing, you no longer implement line-buffered writes correctly. The idea is that if the buffer size is set to 1, data is flushed at \n only, so that if the code builds up the line using many small writes, this doesn't result in many small sends. There was code for this in write() -- why did you delete it?

...

- It also looks like you've broken the semantics of size<0 in read().

...

- Maybe changing the write buffer to a list makes sense too?

I could do this. Then just do a join before the flush. Is the append /that/ much faster?

...

- Since this code appears independent from the timeoutsocket code, maybe we can discuss this separately?

...

Feedback on the documentation:

- I would document that the argument to settimeout() should be int or float (hm, can it be a long? that should be acceptable even if it's strange), and that the return value of gettimeout() is None or a float.

It can be a long in my local copy. The argument can be any numeric value and the special None. I've updated my documentation to be more explicit.

...

- You may want to document the interaction with blocking mode.

...

Feedback on the C socket module changes:

- Why import the select module? It's much more efficient to use the select system call directly. You don't need all the overhead and generality that the select module adds on top of it, and it costs a lot (select allocates lots of objects and lots of memory and hence is very expensive).

...

- <errno.h> is already included by Python.h.

I didn't do this but it's been removed.

...

- Please don't introduce more static functions with a 'Py' name prefix.

...

- You should raise TypeError if the type of the argument is wrong, and ValueError if the value is wrong (out of range). Not SocketError.

Oops. Fixed.

...

- I believe that you can't reliably maintain a "sock_blocking" flag; there are setsockopt() or ioctl() calls that can make a socket blocking or non-blocking. Also, there's no way to know whether a socket passed to fromfd() is in blocking mode or not.

...

- There are refcount bugs. I didn't do a detailed review of these, but I note that the return value from PyFloat_FromDouble() in PySocketSock_settimeout() is leaked. (There's an INCREF that's only needed for the None case.)

This has been fixed. I was one ref count too high in my scheme.

...

- The function internal_select() is *always* used like this:

count = internal_select (s, 1); if (count < 0) return NULL; else if (count == 0) /* Timeout elapsed */ return PyTimeout_Err ();

If internal_select() called PyTimeout_Err() itself, all call sites could be simplified to this:

count = internal_select (s, 1); if (count < 0) return NULL;

or even (now that the count variable is no longer needed) to this:

if(internal_select (s, 1) < 0) return NULL;

Good point. Except the return value needs to be checked for <= 0 in this case. Changes were made.

...

- The accept() wrapper contains this bit of code (only showing the Unix variant):

if (newfd < 0) if (!s->sock_blocking || (errno != EAGAIN && errno != EWOULDBLOCK)) return s->errorhandler ();

Isn't the sense of testing s->sock_blocking wrong? I would think that if we're in blocking mode we'd want to return immediately without even checking the errno. I recommend writing this out more clearly, e.g. like this:

if (s->sock_blocking) return s->errorhandler(); /* A non-blocking accept() failed */ if (errno != EAGAIN && errno != EWOULDBLOCK) return s->errorhandler();

...

- What is s->errorhandler() for? AFAICT, this is always equal to PySocket_Err!

This was always in the module. Not sure why it was put there intially. I used it to be consistent.

...

- The whole interaction between non-blocking mode and timeout mode is confusing to me. Are you sure that this always does the right thing? Have you even thought about what "the right thing" is in all 4 combinations?

Guido van Rossum

12:22 p.m.

[Addressing only points that need attention]

...

...
- Also please don't put a space between the open parenthesis and the first argument (you do this almost consistently in test_timeout.py).

Couldn't really figure out what you were seeing here. I read that you saw something like func( a, b), which I don't see in my local copy.

test_timeout.py from the SF page has this. I'm glad you fixed this already in your own copy.

...

I do have something like this for list comprehension:

[ x.find('\n') for x in self._rbuf ]

Er, but I though there were supposed to be surrounding spaces at the edges...

I prefer to see that as [x.find('\n') for x in self._rbuf]

...

I screwed up the write. New the write is:

[...]

...

which is pretty much the same as the old. The read should be ok though. I could really use someone with a win compiler to test this for me.

I'll review it more when you next upload it.

...

...
- It also looks like you've broken the semantics of size<0 in read().

Maybe I'm misunderstanding the code, but I thought that a size < 0 meant to read until there are no more? The statement:

while size < 0 or buf_len < size:

accomplishes the same thing as what's in the current socket.py implementation. If you look closely, the 'if >= 0' branch *always* returns, meaning that the < 0 is equiv to while 1. Due to shortcutting, the same thing happens in the above statement. Maybe a comment would make it clearer?

...

...
- Maybe changing the write buffer to a list makes sense too?

I could do this. Then just do a join before the flush. Is the append /that/ much faster?

Depends on how small the chunks are you write. Roughly, repeated list append is O(N log N), while repeated string append is O(N**2).

...

...
- Since this code appears independent from the timeoutsocket code, maybe we can discuss this separately?

The point of this code was to keep from losing data when an exception occurs (as timothy, if I remember correctly, pointed out). Hence the reason for keeping a lot more data around in instance variables instead of local variables. So the windows version might (in obscure cases) be affected by the timeout changes. That's what this patch was addressing.

OK, but given the issues the first version had, I recommand that the code gets more review and that you write unit tests for all cases.

...

...
- Please don't introduce more static functions with a 'Py' name prefix.

Only did this in one place, with PyTimeout_Err. The reason was that the other Error functions used the Py prefix, so it was done for consistency. I can change that.. or change the naming scheme with the others if you like.

...

...
- I believe that you can't reliably maintain a "sock_blocking" flag; there are setsockopt() or ioctl() calls that can make a socket blocking or non-blocking. Also, there's no way to know whether a socket passed to fromfd() is in blocking mode or not.

Well, upon socket creation (in init_sockobject), the socket is set to blocking mode. I think that handles fromfd, right? Doesn't every initialization means have to call that function?

...

The real problem would be someone using an ioctl or setsockopt (Can you even do blocking stuff through setsockopt?).

Yes, setblocking() makes a call to setsockopt(). :-)

...

Ugh. The original timeoutsocket didn't really deal with anything like that. Erm, seems like an interface problem here - using ioctl kinda breaks the socket object interface. Perhaps we should be doing some sort of getsockopt to figure out the blocking mode and update our state accordingly? That would be an extra call for each thing to the interface though.

...

One solution is to set/unset blocking mode right before doing each call to be sure of the state and based on the internally stored value of the blocking attribute... but... then that kind of renders ioctl useless.

Don't worry so much about ioctl, but do worry about fromfd.

...

Another solution might be to set the blocking mode to on everytime someone sets a timeout. That would change the blocking/socket interaction already described a bit but not drastically. Also easy to implement. That sends the message: Don't use ioctls when using timeouts.

I like this.

...

Hmm.. Will need to think about this more. Any insight would be helpful or some wisdom about how you usually handle this sort of thing.

...

...
- What is s->errorhandler() for? AFAICT, this is always equal to PySocket_Err!

This was always in the module. Not sure why it was put there intially. I used it to be consistent.

Argh, you're right. MAL added this; I'll ask him why.

...

If you made it this far, it's time for coffee.

When can I expect a new version? --Guido van Rossum (home page: http://www.python.org/~guido/)

Tim Peters

12:32 p.m.

[Guido]

...

...
...
- Maybe changing the write buffer to a list makes sense too?

[mgilfix@eecs.tufts.edu]

...

...
I could do this. Then just do a join before the flush. Is the append /that/ much faster?

[Guido]

...

Depends on how small the chunks are you write. Roughly, repeated list append is O(N log N), while repeated string append is O(N**2).

Michael Gilfix

1:39 p.m.

On Mon, Jun 03 @ 13:22, Guido van Rossum wrote:

...

...
If you made it this far, it's time for coffee.

When can I expect a new version?

Give me a day or two to address these points and produce the new version. -- Michael Gilfix mgilfix@eecs.tufts.edu For my gpg public key: http://www.eecs.tufts.edu/~mgilfix/contact.html

Michael Gilfix

June 2002

2:52 p.m.

On Mon, Jun 03 @ 13:22, Guido van Rossum wrote:

...

...
Couldn't really figure out what you were seeing here. I read that you saw something like func( a, b), which I don't see in my local copy.

test_timeout.py from the SF page has this. I'm glad you fixed this already in your own copy.

Weird. I didn't change anything. Oh well. We'll see if it shows up in the new patch this time round.

...

...
I do have something like this for list comprehension:

[ x.find('\n') for x in self._rbuf ]

Er, but I though there were supposed to be surrounding spaces at the edges...

I prefer to see that as

[x.find('\n') for x in self._rbuf]

Ok. Done. One day, you can explain to me why you despise whitespace so. Perhaps she was mean to you or something. She's always hanging around with that tab guy at any rate and they make a bad mix.

...

...
...
- It also looks like you've broken the semantics of size<0 in read().

I was referring to this piece of code:

! if buf_len > size: ! self._rbuf.append (data[size:]) ! data = data[:size]

Here data[size:] gives you the last byte of the data and data[:size] chops off the last byte.

Ok. This has been fixed. All read sizes now work and have been tested by me.

...

...
...
- Maybe changing the write buffer to a list makes sense too?

I could do this. Then just do a join before the flush. Is the append /that/ much faster?

Depends on how small the chunks are you write. Roughly, repeated list append is O(N log N), while repeated string append is O(N**2).

Done. The write buffer now uses a list, so it should be faster than the initial version and the one currently in use.

...

OK, but given the issues the first version had, I recommand that the code gets more review and that you write unit tests for all cases.

I agree. I wasn't through enough in my checking. I'm going to see if I can include a test-case specifically to test the windows file class directly.

...

...
...
- Please don't introduce more static functions with a 'Py' name prefix.

Only did this in one place, with PyTimeout_Err. The reason was that the other Error functions used the Py prefix, so it was done for consistency. I can change that.. or change the naming scheme with the others if you like.

I like to do code cleanup that doesn't change semantics (like renamings) as a separate patch and checkin. You can do this before or after the timeout changes, but don't merge it into the timeout changes. I still like the static names that you introduce not to start with Py.

Ok. I'll change the PyTimeout_Err to just timeout_err. We can do some other cleanup after the patch has been accepted. It's big enough as is and no need to add more complication.

...

OK, it looks like you call internal_setblocking(s, 0) to set the socket in nonblocking mode. (Hm, I don't see any calls to set the socket in blocking mode!)

So do I understand that you are now always setting the socket in non-blocking mode, even when there is no timeout specified, and that you look at the sock_blocking flag to decide whether to do timeouts or just pass the nonblocking behavior to the user?

This is a change in semantics, and could interfere with existing applications that pass the socket's file descriptor off to other code. I think I'd be happier if the behavior wasn't changed at all until a timeout is set for a socket -- then existing code won't break.

So, the best way to proceed seems to be: if (s->sock_timeout == Py_None) /* Perhaps do nothing, or just do original behavior */ else /* Get funky. Do one of the solutions discussed below */

...

I only really care for sockets passed in to fromfd(). E.g. someone can currently do:

s1 = socket(AF_INET, SOCK_STREAM) s1.setblocking(0)

s2 = fromfd(s1.fileno()) # Now s2 is non-blocking too

I'd like this to continue to work as long as s1 doesn't set a timeout.

I see the issue. We'll worry about this and not ioctl. So let's look at solutions:

...

...
One solution is to set/unset blocking mode right before doing each call to be sure of the state and based on the internally stored value of the blocking attribute... but... then that kind of renders ioctl useless.

Don't worry so much about ioctl, but do worry about fromfd.

Not so popular.

...

...
Another solution might be to set the blocking mode to on everytime someone sets a timeout. That would change the blocking/socket interaction already described a bit but not drastically. Also easy to implement. That sends the message: Don't use ioctls when using timeouts.

I like this.

Guido van Rossum

4:33 p.m.

...

Ok. Done. One day, you can explain to me why you despise whitespace so. Perhaps she was mean to you or something. She's always hanging around with that tab guy at any rate and they make a bad mix.

I like the whitespace use in the English language (like so) best.

...

Ok. This has been fixed. All read sizes now work and have been tested by me.

Have you written unit tests? That would be really great. Ideally, the tests should pass both before and after your patches.

...

So, the best way to proceed seems to be:

if (s->sock_timeout == Py_None) /* Perhaps do nothing, or just do original behavior */ else /* Get funky. Do one of the solutions discussed below */

Yes.

...

So here are the new semantics:

If you set_timeout(int/float/long != None): The actual socket gets put in non-blocking mode and the usual select stuff is done. If you set_timeout(None): The old behavior is used AND automatically, the socket is set to blocking mode. That means that someone who was doing non-blocking stuff before, sets a timeout, and then unsets one, will have to do a set_blocking call again if he wants non-blocking stuff. This makes sense 'cause timeout stuff is blocking by nature.

Sounds good! --Guido van Rossum (home page: http://www.python.org/~guido/)

Michael Gilfix

5:07 p.m.

On Wed, Jun 05 @ 17:33, Guido van Rossum wrote:

...

Michael Gilfix

5:25 p.m.

8298

Age (days ago)

8322

Last active (days ago)

List overview

Download

10 comments

3 participants

participants (3)

Guido van Rossum
Michael Gilfix
Tim Peters

Socket timeout patch

Michael Gilfix

Michael Gilfix

Michael Gilfix

Michael Gilfix

Michael Gilfix

Michael Gilfix

Michael Gilfix

Michael Gilfix

Michael Gilfix

Michael Gilfix

Michael Gilfix

Michael Gilfix

Michael Gilfix

tags

participants (3)