One possibility is that the Linux getaddrinfo() is thread-safe, but only by way of a lock that only allows one request to be outstanding at a time.
The next step should be to get the getaddrinfo() source code from glibc and see what it does. It's open source, hey. :-)
I can dig around a bit, but I have to figure out what I'm looking for. On the failure platform, are we sure Python is using the native getaddrinfo, not the Python-supplied one?
I've had some fun (not) with the latter. For an LSB-conforming version of Python, I can't let it use the glibc version of getaddrinfo because it's not in the spec (it will be in the next version). But the Python addrinfo.h header has some fields in a different order than the Linux one, and it managed to call the Linux one anyway. The result of that was not subtle, however :-) so I don't think that's the problem that started this thread.
I do know the Linux (or rather, glibc) getaddrinfo doesn't get reentrancy through magic: it calls gethostbyname_r and gethostbyaddr_r. (Note that the Python emulation of getaddrinfo just calls the plain gethostbyname and gethostbyaddr routines, and so is likely not reentrant.)
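To make the reentrancy point concrete, here is a minimal Python sketch of how a lookup routine that may not be reentrant can be made thread-safe with a single lock, at the cost of allowing only one request at a time. It is not taken from the Python or glibc sources; the names `locked_lookup` and `_resolver_lock` are hypothetical.

```python
import socket
import threading

# Hypothetical module-level lock; not part of Python's socket module or glibc.
_resolver_lock = threading.Lock()


def locked_lookup(host, lookup=socket.gethostbyname):
    """Call a possibly non-reentrant lookup with only one caller at a time."""
    # Every caller waits here, so concurrent requests are served one by one.
    with _resolver_lock:
        return lookup(host)
```

The call is safe from any thread, but a slow lookup in one thread delays every other thread waiting on the same lock.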
Mats Wichmann
One possibility is that the Linux getaddrinfo() is thread-safe, but only by way of a lock that only allows one request to be outstanding at a time.
The next step should be to get the getaddrinfo() source code from glibc and see what it does. It's open source, hey. :-)
I can dig around a bit, but I have to figure out what I'm looking for.
I think that part is already settled: getaddrinfo, on Linux, is thread-safe.
On the failure platform, are we sure Python is using the native getaddrinfo, not the Python-supplied one?
Correct.
I think the remaining question is: Even if the GIL is released around getaddrinfo - why is the performance of Jeremy's test script still that bad?
Regards, Martin
Mats Wichmann writes:
One possibility is that the Linux getaddrinfo() is thread-safe, but only by way of a lock that only allows one request to be outstanding at a time.
The next step should be to get the getaddrinfo() source code from glibc and see what it does. It's open source, hey. :-)
I can dig around a bit, but I have to figure out what I'm looking for.
[MvL]
I think that part is already settled: getaddrinfo, on Linux, is thread-safe.
On the failure platform, are we sure Python is using the native getaddrinfo, not the Python-supplied one?
Correct.
I think the remaining question is: Even if the GIL is released around getaddrinfo - why is the performance of Jeremy's test script still that bad?
I tried to read the glibc getaddrinfo() source, but it looks like it would be a term project... It could be that it's just doing a lot more interaction with a DNS server.
I believe that Jeremy suspects that the test program isn't just slow, but that one slow thread actually blocks all other threads from making progress. If that's the case (we don't know for sure), we're looking for a bottleneck in the getaddrinfo() code that somehow holds a resource needed by all threads calling getaddrinfo().
--Guido van Rossum (home page: http://www.python.org/~guido/)
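The suspicion that one lookup blocks the others could be probed from Python without reading the glibc source. The following is only a rough sketch, not Jeremy's test script: the helper `time_lookups` is a hypothetical name, and it assumes each lookup runs in its own thread and that the interpreter releases the GIL around the call.

```python
import socket
import threading
import time


def time_lookups(hosts, lookup=socket.getaddrinfo):
    """Resolve each host in its own thread and return the total wall time."""

    def worker(host):
        try:
            lookup(host, None)
        except OSError:
            pass  # a failed lookup still takes time, which is what we measure

    threads = [threading.Thread(target=worker, args=(h,)) for h in hosts]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start
```

One way to use it is to time a single host first and then N different hosts. If the N-host time is close to N times the single-host time, the lookups are probably being served one at a time; if it stays close to the single-host time, they overlap. Resolver caching and DNS server latency can skew the numbers, so treat the result as a hint rather than proof.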
participants (3)
- Guido van Rossum
- martin@v.loewis.de
- Mats Wichmann