[Python-bugs-list] [ python-Bugs-774751 ] slow socket binding & netinfo lookups

SourceForge.net noreply@sourceforge.net
Tue, 22 Jul 2003 13:05:07 -0700


Bugs item #774751, was opened at 2003-07-20 18:57
Message generated for change (Comment added) made by montanaro
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=774751&group_id=5470

Category: Macintosh
Group: Python 2.3
Status: Open
Resolution: Postponed
Priority: 5
Submitted By: Stuart Bishop (zenzen)
Assigned to: Nobody/Anonymous (nobody)
Summary: slow socket binding & netinfo lookups

Initial Comment:
The following code takes < 1 second to run under Python 2.1, but
20 seconds to run under Python 2.2 or 2.3 on a top-end PowerBook
running OSX 10.2. It is part of the Zope startup routine.

import socket

def max_server_sockets():
    # Open listening sockets until the OS refuses, then report the count.
    sl = []
    while 1:
        try:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.bind(('', 0))
            s.listen(5)
            sl.append(s)
        except socket.error:
            break
    num = len(sl)
    for s in sl:
        s.close()
    del sl
    return num
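
To see where the time goes, it may help to time each socket
individually rather than the whole loop. The following is only a
sketch, assuming a modern Python 3 (time.perf_counter did not exist in
2.1-2.3); the name time_binds and the default count of 20 are
illustrative and not part of Zope.

```python
import socket
import time

def time_binds(count=20):
    """Return elapsed seconds for each socket()/bind()/listen() sequence.

    Hypothetical helper: name and default count are assumptions.
    """
    timings = []
    sockets = []
    try:
        for _ in range(count):
            start = time.perf_counter()
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sockets.append(s)
            s.bind(('', 0))      # same wildcard bind as max_server_sockets
            s.listen(5)
            timings.append(time.perf_counter() - start)
    finally:
        # Close everything we opened, even if a bind failed part way.
        for s in sockets:
            s.close()
    return timings

if __name__ == "__main__":
    samples = time_binds()
    print("mean seconds per socket: %.6f" % (sum(samples) / len(samples)))
```

A per-socket figure near the 150 ms mentioned elsewhere in this
thread would point at the bind path rather than at socket creation.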



----------------------------------------------------------------------

>Comment By: Skip Montanaro (montanaro)
Date: 2003-07-22 15:05

Message:
Logged In: YES 
user_id=44345

I assigned it to you at least in part because your footprints
were in the vicinity of the slow call.  We've decided to just
let it drop until after the release, so feel free to unassign
it or direct it back to me.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-07-22 14:32

Message:
Logged In: YES 
user_id=92689

Why on earth was this assigned to me? I have nothing to do with 
the slowness in 2.2, I fixed a thread issue for 2.3, and that was 
already more than I intended. I know nothing about getaddrinfo.

----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-07-22 09:27

Message:
Logged In: YES 
user_id=12800

Apple also said getaddrinfo has been "thread safe for a
while", right?  It sounds to me like our approach should be
to write the code as if it were <wink> and as if it were
fast, and assume Apple has/will get their act together on this.

Agreed that we should do nothing here for Python 2.3.

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2003-07-22 09:08

Message:
Logged In: YES 
user_id=44345

Putting this on the back burner for now and assigning to Just as a
present for when he returns ;-).  I haven't been able to resolve
the problem and Apple tells me that getaddrinfo was completely
rewritten for 10.3 (Panther).

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2003-07-21 21:48

Message:
Logged In: YES 
user_id=44345

Another little observation.  I ran the getaddr program (with the
loop cranked up to 10 million iterations) while the top command
was running.  The lookupd process didn't show on the radar at
all.  While running the Python script, however, lookupd
consumed about 50% of the CPU.


----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2003-07-21 21:42

Message:
Logged In: YES 
user_id=44345

I think we've concluded that it's okay to ship 2.3 with this problem
unresolved; however, I'm completely befuddled at the moment.  I wrote
this simple program:

    #include <sys/time.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netdb.h>
    #include <stdio.h>
    #include <string.h>

    int
    main(int argc, char **argv) {
        int i;
        struct addrinfo hints, *res;
        int error;
        struct timeval t, u;

        /* zero the struct so the fields not set below are well defined */
        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_INET;
        hints.ai_socktype = SOCK_DGRAM;     /*dummy*/
        hints.ai_flags = AI_PASSIVE;

        printf("start\n");
        for (i=0; i<100; i++) {
            gettimeofday(&t, NULL);
            error = getaddrinfo(0, "0", &hints, &res);
            gettimeofday(&u, NULL);
            fprintf(stderr, "gtod: %.6f\n",
                    u.tv_sec-t.tv_sec+(u.tv_usec-t.tv_usec)*1e-6);
            /* res is only valid when the call succeeded */
            if (error == 0)
                freeaddrinfo(res);
        }
        printf("finish\n");
        return 0;
    }

When run on my Mac, it takes between 2 and 7 microseconds per
getaddrinfo call.  The exact same instrumented call inside
socketmodule.c:setipaddr at line 700 takes about 150 *milli*seconds
per call.  I tried eliminating the Py_{BEGIN,END}_ALLOW_THREADS calls
as well as the ACQUIRE and RELEASE of the getaddrinfo lock.  That had
no effect on the call.

I also tweaked the compile line to run just the C preprocessor
(gcc -E instead of gcc -g) and checked the output:

                ...
                hints.ai_family = af;
                hints.ai_socktype = 2;
                hints.ai_flags = 0x00000001;
                { PyThreadState *_save; _save = PyEval_SaveThread();
                PyThread_acquire_lock(netdb_lock, 1);
                gettimeofday(&t, 0);
                error = getaddrinfo(0, "0", &hints, &res);
                gettimeofday(&u, 0);
                fprintf((&__sF[2]), "gtod: %.6f\n",
                        u.tv_sec-t.tv_sec+(u.tv_usec-t.tv_usec)*1e-6);
                PyEval_RestoreThread(_save); }
                ...

otool -L run against both the test program and _socket.so indicates
that both refer to just one shared library
(/usr/lib/libSystem.B.dylib), so both should be calling the same
routine.
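
The same lookup can also be timed from the Python side, which goes
through the socket module rather than calling libSystem directly. This
is only a sketch and not the instrumentation described above: it
assumes a modern Python 3 for time.perf_counter, and the function name
time_getaddrinfo is made up for illustration.

```python
import socket
import time

def time_getaddrinfo(iterations=100):
    """Return elapsed seconds for each passive AF_INET lookup.

    Hypothetical helper mirroring the C test loop: NULL host,
    service "0", SOCK_DGRAM as a dummy socket type, AI_PASSIVE.
    """
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        socket.getaddrinfo(None, "0", socket.AF_INET,
                           socket.SOCK_DGRAM, 0, socket.AI_PASSIVE)
        samples.append(time.perf_counter() - start)
    return samples

if __name__ == "__main__":
    samples = time_getaddrinfo()
    print("min %.6f  max %.6f" % (min(samples), max(samples)))
```

Comparing these figures with the standalone C program's output would
show whether the extra time appears only when the call is made from
inside the interpreter.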


----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-07-21 15:52

Message:
Logged In: YES 
user_id=45365

I have absolutely nothing to contribute wrt. this bug. As Just is
away I guess Skip is the best target for the hot potato.

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2003-07-21 11:33

Message:
Logged In: YES 
user_id=44345

Digging a bit deeper, it appears on Mac OS X that
fake_getaddrinfo is used.  That's deemed not to be thread-safe.
I see three static or global variables accessed by this function:

    firsttime - simple guard
    translate - set inside the guard - readonly after that
    faith_prefix - same

Why not push the guard and initialization code into init_socket
and protect it with the getaddrinfo lock?  fake_getaddrinfo
would thus be thread-safe and could be accessed without
acquiring a lock.


----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2003-07-21 11:14

Message:
Logged In: YES 
user_id=44345

whoops...  my apologies for not posting here.


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-07-21 10:50

Message:
Logged In: YES 
user_id=31435

Attaching copious info from Skip Montanaro (skip.txt).

----------------------------------------------------------------------

Comment By: Anthony Baxter (anthonybaxter)
Date: 2003-07-21 02:05

Message:
Logged In: YES 
user_id=29957

Is this something that should be looked at for 2.3rc2?


----------------------------------------------------------------------
