[ python-Bugs-1467080 ] Many functions in socket module are not thread safe
SourceForge.net
noreply at sourceforge.net
Tue Aug 22 01:43:20 CEST 2006
Bugs item #1467080, was opened at 2006-04-09 03:09
Message generated for change (Comment added) made by loewis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1467080&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: Python 2.4
>Status: Closed
>Resolution: Duplicate
Priority: 5
Submitted By: Maxim Sobolev (sobomax)
Assigned to: Nobody/Anonymous (nobody)
Summary: Many functions in socket module are not thread safe
Initial Comment:
The socket module make a great effort to be thread-safe, but misses one
big point - it uses single per-instance buffer to hold resolved
sockaddr_xxx structures. Therefore, on SMP system it is possible that
several threads calling functions that perform address resolution in
parallel will stomp on each other resulting in incorrect or invalid address
to be used in each case.
For example, two threads calling sendto() in parallel can result in packets
to be sent to incorrect addresses - packets from thread one from time to
time will be sent to address requested by thread two and vice versa.
Another, smaller issue is that the call to getnameinfo() is not protected
with netdb mutex on systems that don't have thread-safe resolver.
----------------------------------------------------------------------
>Comment By: Martin v. Löwis (loewis)
Date: 2006-08-22 01:43
Message:
Logged In: YES
user_id=21627
The latest patch got resubmitted as #1544279.
----------------------------------------------------------------------
Comment By: Maxim Sobolev (sobomax)
Date: 2006-04-25 00:31
Message:
Logged In: YES
user_id=24670
OK, it looks like the only way to get this bug fixed is to reimplement patch to
use stack instead of heap, so that here we go. Attached please find new version
of the patch which allocates address buffer on stack.
-Maxim
----------------------------------------------------------------------
Comment By: Maxim Sobolev (sobomax)
Date: 2006-04-18 01:05
Message:
Logged In: YES
user_id=24670
> The big win is in simplification of the code.
What do you call "big simplification"? Several malloc/free
calls and appropriate NULL checks?
Aside from stack usage issues the big loss here is
portability - we either need to use some fixed length
buffer with the size sufficient to hold any type of address
(ugly!) or use sockaddr_storage, which may not exist on all
platforms supported by the python.
-Maxim
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2006-04-18 00:51
Message:
Logged In: YES
user_id=21627
The big win is in simplification of the code. Also, we are
not talking about 10k here. On Linux, sock_addr_t is 128
bytes. Take a look at set_error: it allocates 100 bytes for
an error buffer. Or sock_repr: 512 bytes for a buffer. Or
socket_gethostname: 1024 bytes. Or socket_gethostbyname_ex:
16384 bytes. socket_getaddrinfo: 30 bytes. os_init: 100 bytes
Or, looking at other modules: symtable.c:check_unoptimized:
300 bytes. Py_GetVersion: 250 bytes. PySys_SetArgv:
2*MAXPATHLEN+1 (on Linux, this is 8193 bytes). I could go on.
Don't worry about stack consumption. Or, if you do, analyse
the Python source code, and fix the big offenders first.
Premature optimization is the root of all evil.
----------------------------------------------------------------------
Comment By: Maxim Sobolev (sobomax)
Date: 2006-04-18 00:10
Message:
Logged In: YES
user_id=24670
> The address space available to each thread typically
doesn't
> depend on the number of threads. Instead, the stack size
is
> pre-determined, so it's vice versa: the number of threads
> supported depends on that stack-size, which (currently)
isn't tunable.
Yes, but my point is that on low-end and/or embedded system
the system can be configured with as low stack per thread
as possible (since with for example 100 threads, every
extra 10K of stack per thread means 1M of extra memory,
which in the absence of paging needs to be wired down) and
if only one thread needs this stack 990K of it will be
effectively wasted. And since getaddrinfo()-family already
uses heap for its results I don't see any big win in
avoiding extra malloc()/free() call. Expecially considering
that we are dealing with i/o here, so that system call
overhead will be much more than that anyway, even if the
program calls those functions heavily.
-Maxim
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2006-04-17 23:22
Message:
Logged In: YES
user_id=21627
The argument of "hundreds of threads" is a red herring. The
address space available to each thread typically doesn't
depend on the number of threads. Instead, the stack size is
pre-determined, so it's vice versa: the number of threads
supported depends on that stack-size, which (currently)
isn't tunable. Also, stack space is typically a scarce
resource only for recursive functions. For leave functions,
it doesn't matter, unless a single function consumes the
majority of the stack (which certainly wouldn't be the case
here).
Allocation on the stack is cheap, and over-allocation
doesn't hurt. Please change the patch to use automatic
variables. It will become simpler and more maintainable that
way (in addition, it should also become minimally faster).
----------------------------------------------------------------------
Comment By: Maxim Sobolev (sobomax)
Date: 2006-04-17 22:26
Message:
Logged In: YES
user_id=24670
> Why is it necessary to allocate the memory dynamically?
> Couldn't the caller provide the memory on the stack (i.e.
> through a local variable)?
Local variable is not very good IMHO since in threading
environment with hundreds of threads running at the same
time stack can be a scarce resource. Another issue is that
the caller doesn't know the actual size it has to allocate
until the resolution has been done, therefore it would need
to overallocate in most cases.
-Maxim
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2006-04-11 07:22
Message:
Logged In: YES
user_id=21627
Why is it necessary to allocate the memory dynamically?
Couldn't the caller provide the memory on the stack (i.e.
through a local variable)?
----------------------------------------------------------------------
Comment By: Maxim Sobolev (sobomax)
Date: 2006-04-11 02:08
Message:
Logged In: YES
user_id=24670
Sorry, for some reason the patch did not go through first time. See attached.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2006-04-09 23:18
Message:
Logged In: YES
user_id=21627
I wonder why the sock_addr member is there in the socket
object in the first place. AFAICT, there is no need for it;
it was introduced in r4509.
Would you like to work on a patch?
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1467080&group_id=5470
More information about the Python-bugs-list
mailing list