urllib2 spinning CPU on read
kdotsky at gmail.com
Sun Nov 26 09:54:43 CET 2006
I've run into this problem on several sites: urllib2 hangs (using all
the CPU) while trying to read a page. I was able to reproduce it
for one particular site. I'm using Python 2.4.
import urllib2

url = 'http://www.wautomas.info'
request = urllib2.Request(url)
opener = urllib2.build_opener()
result = opener.open(request)
data = result.read()  # hangs here
It never returns from this read call.
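For anyone who wants to see where the time goes in a call like this, here is a minimal sketch of collecting a profile sorted by internal time. It is written for current Python, where cProfile is available; as far as I know the pure-Python profile module plays the same role on older versions. busy_work and profile_call are hypothetical names used only for illustration, and busy_work stands in for the slow call (for example result.read()).

```python
import cProfile
import io
import pstats


def busy_work(n):
    # Hypothetical stand-in for the slow call being investigated.
    total = 0
    for i in range(n):
        total += len(str(i))
    return total


def profile_call(func, *args):
    """Run func under the profiler and return (result, report text)."""
    prof = cProfile.Profile()
    result = prof.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(prof, stream=out)
    # 'time' sorts by internal time; 50 limits the listing to 50 rows.
    stats.sort_stats('time').print_stats(50)
    return result, out.getvalue()


result, report = profile_call(busy_work, 1000)
```

The report text has the same layout as the statistics quoted further down: a summary line, the sort order, and one row per function.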
I did some profiling to try and see what was going on and make sure it
wasn't my code. There was a huge number of calls to (and amount of
time spent in) socket.py:315(readline) and to recv. A large amount of
time was also spent in httplib.py:482(_read_chunked). Here's the
significant part of the statistics:
32564841 function calls (32563582 primitive calls) in 545.250

   Ordered by: internal time
   List reduced from 416 to 50 due to restriction <50>

     ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
   10844775  233.920    0.000  447.440    0.000  socket.py:315(readline)
   10846078  152.430    0.000  152.430    0.000  :0(recv)
          3   97.330   32.443  544.730  181.577
   10844812   61.090    0.000   61.090    0.000  :0(join)
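The numbers above do not establish the cause. One possibility, assumed here purely for illustration, is a chunked response whose connection ends before the terminating zero-size chunk, so that a reader keeps asking for a line that never arrives. The sketch below is not httplib's code; read_chunked is a hypothetical helper showing how a chunked body is decoded and how an explicit end-of-file check keeps such a loop from spinning.

```python
import io


def read_chunked(fp):
    """Decode a chunked HTTP body from a binary file-like object."""
    parts = []
    while True:
        line = fp.readline()
        if not line:
            # EOF before the zero-size chunk: fail instead of retrying.
            raise EOFError("connection closed before final chunk")
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size_text = line.split(b";", 1)[0].strip().decode("ascii")
        size = int(size_text, 16)
        if size == 0:
            break
        data = fp.read(size)
        if len(data) < size:
            raise EOFError("connection closed in the middle of a chunk")
        parts.append(data)
        fp.readline()  # consume the CRLF that follows each chunk
    # Skip optional trailer headers up to the blank line (or EOF).
    while True:
        line = fp.readline()
        if not line or line in (b"\r\n", b"\n"):
            break
    return b"".join(parts)


body = read_chunked(io.BytesIO(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"))
```

With a truncated input such as b"4\r\nWiki\r\n", this version raises EOFError on the next size line rather than looping.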
Also, where should I go to see if something like this has already been
reported as a bug?
Thanks for any help you can give me.