[ python-Bugs-876637 ] Random stack corruption from socketmodule.c

SourceForge.net noreply at sourceforge.net
Tue Feb 7 08:18:12 CET 2006


Bugs item #876637, was opened at 2004-01-13 22:41
Message generated for change (Comment added) made by nnorwitz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=876637&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
>Group: Python 2.4
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Mike Pall (mikesfpy)
>Assigned to: Neal Norwitz (nnorwitz)
Summary: Random stack corruption from socketmodule.c 

Initial Comment:
THE PROBLEM:

The implementation of the socket_object.settimeout() method
(socketmodule.c, function internal_select()) uses the select() system
call without checking the file descriptor number against FD_SETSIZE.
FD_SET then writes outside the on-stack fd_set, which causes random
stack corruption if fd>=FD_SETSIZE.

This took me ages to track down! It happened with a massively
multithreaded and massively connection-swamped network server.
Basically most of the descriptors did not use that routine (because
they were either pure blocking or pure non-blocking). But one module
used settimeout() and with a little bit of luck got an fd>=FD_SETSIZE
and with even more luck corrupted the stack and took down the whole
server process.

Demonstration script appended.

THE SOLUTION:

The solution is to use poll() and to favour poll() even if select()
is available on a platform. The current trend in modern OS+libc
combinations is to emulate select() in libc and call kernel-level poll()
anyway. And this emulation is costly (both for the caller and for libc).

The reverse (emulating poll() on top of select()) is done only by some
systems of historical interest, so we definitely want to use poll() if
it's available (even if it's an emulation).
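
As a rough illustration of the poll()-first approach described above,
here is a minimal Python sketch (not the socketmodule.c patch itself).
The helper name wait_readable is hypothetical; it uses select.poll()
where the platform provides it and falls back to select.select()
otherwise.

```python
import select


def wait_readable(sock, timeout):
    """Return True if sock becomes readable within `timeout` seconds.

    Hypothetical helper for illustration only.
    """
    fd = sock.fileno()
    if hasattr(select, "poll"):
        # poll() takes descriptors in a list, so it has no FD_SETSIZE limit.
        poller = select.poll()
        poller.register(fd, select.POLLIN)
        events = poller.poll(int(timeout * 1000))  # poll() wants milliseconds
        return bool(events)
    # Fallback for platforms without poll(); subject to the FD_SETSIZE limit.
    readable, _, _ = select.select([fd], [], [], timeout)
    return bool(readable)
```

The timeout is converted to milliseconds because poll() uses that unit,
while select() takes seconds.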

And if select() is your only choice, then check that fd < FD_SETSIZE
before using the FD_SET macro (and raise some strange exception if that
check fails).
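
The range check suggested above could look roughly like this in Python.
Python's select module does not expose FD_SETSIZE, so the value 1024
below is an assumption (a common default on Linux), and the names
ASSUMED_FD_SETSIZE and select_readable are hypothetical. In C the
equivalent guard would compare fd against FD_SETSIZE before calling
FD_SET.

```python
import select

# Assumption: FD_SETSIZE is 1024. The real value is platform-specific and
# is only available from the C headers.
ASSUMED_FD_SETSIZE = 1024


def select_readable(fd, timeout):
    """Wait for `fd` to become readable, refusing out-of-range descriptors."""
    if fd < 0 or fd >= ASSUMED_FD_SETSIZE:
        # Fail loudly instead of letting select() touch memory out of bounds.
        raise ValueError("file descriptor %d out of range for select()" % fd)
    readable, _, _ = select.select([fd], [], [], timeout)
    return bool(readable)
```

Raising an exception turns silent memory corruption into an error the
caller can see and handle.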

[
I should note that using SO_RCVTIMEO and SO_SNDTIMEO would be a lot more
efficient (kernel-wise at least). Unfortunately they are not universally
available (though defined by most system header files). But a simple
runtime test with a fallback to poll()/select() would do.
]
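
The runtime test mentioned in the note above might be sketched like
this in Python. The helper name try_kernel_recv_timeout is hypothetical,
and the packing of struct timeval as two native longs is an assumption
that holds on common Unix-like systems but is not guaranteed everywhere.
A False result means the caller should fall back to poll()/select().

```python
import socket
import struct


def try_kernel_recv_timeout(sock, seconds):
    """Try to set SO_RCVTIMEO on `sock`; return True if the OS accepted it."""
    if not hasattr(socket, "SO_RCVTIMEO"):
        return False
    sec = int(seconds)
    usec = int((seconds - sec) * 1000000)
    # Assumption: struct timeval is laid out as two native longs.
    timeval = struct.pack("ll", sec, usec)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVTIMEO, timeval)
    except OSError:
        # Option defined in the headers but not supported at runtime.
        return False
    return True
```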

A PATCH, A PATCH?

Well, the check for FD_SETSIZE is left as an exercise for the reader. :-)
Don't forget to merge this with the stray select() way down by adding 
a return value to internal_select().

But yes, I can do a 'real' patch with poll() [and even one with the
SO_RCVTIMEO trick if you are adventurous]. But, I can't test it with
dozens of platforms, various include files, compilers and so on.

So, dear Python core developers: Please discuss this and tell me
if you want a patch. If so, you'll get one ASAP.

Thank you for your time!


----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2006-02-06 23:18

Message:
Logged In: YES 
user_id=33168

Thanks!

Committed revision 42253.
Committed revision 42254. (2.4)


----------------------------------------------------------------------

Comment By: Troels Walsted Hansen (troels)
Date: 2004-06-10 04:15

Message:
Logged In: YES 
user_id=32863

I have created a patch to make socketmodule use poll() when
available. See http://python.org/sf/970288

(I'm not allowed to attach patches to this bug item.)


----------------------------------------------------------------------
