urllib.urlretrieve never returns???

Tue Mar 20 03:08:12 EDT 2012

Here you can find the example program and the original post.

http://code.activestate.com/lists/python-list/617894/
>
> I gather you are running urlretrieve in a separate thread, inside a GUI?
Yes.
>
> I have learned that whenever I have inexplicable behaviour in a function,
> I should check my assumptions. In this case, (1) are you sure you have
> the right urlretrieve, and (2) are you sure that your self.Log() method
> is working correctly? Just before the problematic call, do this:
>
> # was:
> fpath = urllib.urlretrieve(imgurl)[0]
>
> # becomes:
> print(urllib.__file__, urllib.urlretrieve)
> self.Log(urllib.__file__, urllib.urlretrieve)
> fpath = urllib.urlretrieve(imgurl)[0]
I called self.Log() after each line, and also from a general "except:"
clause. The line after urlretrieve is definitely never executed, and no
exception is raised. The number of threads goes up (visible in Task
Manager).
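
One way to see where a worker thread is stuck when no exception surfaces
is to dump the stack of every live thread from another thread. A minimal
sketch, assuming CPython (it relies on sys._current_frames());
dump_thread_stacks is a hypothetical helper, not part of the example
program:

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return one formatted stack trace per live thread (CPython only)."""
    # Map thread identifiers to readable names for the report headers.
    names = dict((t.ident, t.name) for t in threading.enumerate())
    chunks = []
    for ident, frame in sys._current_frames().items():
        header = "Thread %s (%s):\n" % (names.get(ident, "?"), ident)
        chunks.append(header + "".join(traceback.format_stack(frame)))
    return "\n".join(chunks)
```

Passing the result to self.Log() from a timer a few seconds after the
download starts should show the line each thread is blocked on.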

It is true that the program uses another module that uses the socket
module and multiple threads. (These are written in pure Python.)

If I remove the other module, there is no error, but that renders the
application useless. If I start the program with a console (e.g. with
python.exe instead of pythonw.exe), then it works. It looks like opening
a console solves the problem, although nothing is ever printed on the
console.
> and ensure that you haven't accidentally shadowed them with something
> unexpected. Does the output printed to the console match the output
> logged?
Well, this cannot be tested. If there is a console, then there is no 
problem.
>
> What happens if you take the call to urlretrieve out of the thread and
> call it by hand?
Then it works.
> Run urllib.urlretrieve(imgurl) directly in the
> interactive interpreter. Does it still hang forever?
Then it works perfectly.
>
> When you say it "never" returns, do you mean *never* or do you mean "I
> gave up waiting after five minutes"? What happens if you leave it to run
> all day?
I did not try that. But I have already set the socket timeout to 10
seconds, and it is definitely not waiting for a response from the server.
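
For reference, a process-wide timeout like this is normally set with
socket.setdefaulttimeout(). A minimal sketch; the exact call in the
example program is not shown in this message, so where it is placed is an
assumption:

```python
import socket

# Applies to socket objects created after this call; sockets that
# already exist keep their own timeout setting.
socket.setdefaulttimeout(10.0)

# Read the value back to confirm that no other module has reset it.
current_timeout = socket.getdefaulttimeout()
```

The timeout bounds each blocking socket operation, not the download as a
whole, and a later setdefaulttimeout() call from any other module
replaces it. Logging getdefaulttimeout() just before the urlretrieve call
would rule that out.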
>
> How big are the files you are trying to retrieve?
34 KB
> Try retrieving a really small file. Then try retrieving a non-existent file.
Good point. I'll try to retrieve a nonexistent file when I get home. :)
>
> What happens if you call urlretrieve with a reporthook argument? Does it
> print anything?
I'll try this too. I'll also try using pycurl or the low level socket 
module instead.
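
A sketch of the reporthook experiment. urlretrieve calls the hook as
reporthook(block_number, block_size, total_size); make_report_hook, fetch
and the log argument are hypothetical names, with log standing in for
self.Log:

```python
try:
    from urllib import urlretrieve  # Python 2, as used in this thread
except ImportError:
    from urllib.request import urlretrieve  # Python 3


def make_report_hook(log):
    """Build a reporthook that reports progress through `log`."""
    def hook(block_number, block_size, total_size):
        # total_size is -1 when the size of the file is not reported.
        received = block_number * block_size
        if total_size > 0:
            received = min(received, total_size)
        log("received %d of %d bytes" % (received, total_size))
    return hook


def fetch(imgurl, log):
    """Download imgurl to a temporary file, logging each block."""
    return urlretrieve(imgurl, reporthook=make_report_hook(log))[0]
```

If the hook never logs anything, the hang is probably before any data is
transferred. If it logs block 0 and then stops, the connection was opened
and the hang is in reading the body.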
>
> What happens if you try to browse to imgurl in your web browser? Are you
> sure the problem is with urlretrieve and not the source?
Yes.