FTP hangs with urllib.urlretrieve
Oleg Broytmann
phd at phd.russ.ru
Tue Mar 7 05:15:43 EST 2000
Hello!
I am developing/running a URL checker (sorry, yet another one :), and I
have found that it hangs consistently on some FTP URLs.
Here is the test program I am using to retest my results: just a
URLopener and urlretrieve, nothing magical. Python is stock 1.5.2, and it
hangs on all platforms I am using: Pentium Linux, Sparc Solaris, and Pentium FreeBSD:
----------
#! /usr/local/bin/python -O
import sys, urllib
urllib._urlopener = urllib.URLopener()
# Some sites allow only Mozilla-compatible browsers; way to stop robots?
server_version = "Mozilla/3.0 (compatible; Python-urllib/%s)" % urllib.__version__
urllib._urlopener.addheaders[0] = ('User-agent', server_version)
url = sys.argv[1]
print "Testing", url
try:
    fname, headers = urllib.urlretrieve(url)
    print fname
    print headers
except Exception, msg:
    print msg
    import traceback; traceback.print_exc()
----------
The program always hangs on some (but not all) FTP URLs. One is
well known to the Python community: ftp://starship.python.net/pub/crew/jam/ :)
Others are:
ftp://ftp.sai.msu.su/
ftp://ftp.radio-msu.net/
ftp://ftp.relcom.ru/pub/
ftp://ftp.sunet.se/pub/
ftp://ftp.cs.wisc.edu/
ftp://ftp.cert.org/pub/
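For now, since the stock 1.5.2 socket module has no timeout support, the
only portable way I know to keep a hung retrieve from blocking the whole
checker is a SIGALRM-based interrupt (Unix only; the helper name
with_timeout below is my own, just a sketch):
----------
import signal

class Timeout(Exception):
    pass

def _raise_timeout(signum, frame):
    raise Timeout("operation timed out")

def with_timeout(seconds, func, *args):
    # Run func(*args), raising Timeout if it takes longer than `seconds`.
    old_handler = signal.signal(signal.SIGALRM, _raise_timeout)
    signal.alarm(seconds)
    try:
        return func(*args)
    finally:
        signal.alarm(0)                          # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)
----------
With that in place the test script can call
with_timeout(60, urllib.urlretrieve, url) and catch Timeout instead of
hanging forever, though of course it only papers over the real problem.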
I have tested these sites with other FTP clients (Midnight Commander,
Netscape Navigator, ncftp), and all of them are accessible, so it looks
like a bug (or bugs) in ftplib.
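One way to narrow down whether ftplib itself is at fault would be to
drive the FTP conversation by hand and toggle passive mode; active (PORT)
data connections, which I believe ftplib uses by default, are a common
source of exactly this kind of silent hang when a firewall drops the
server's connection back to the client. A sketch (the helper name
fetch_listing is my own):
----------
import ftplib

def fetch_listing(host, path="/", passive=1):
    # List a directory over anonymous FTP, with passive mode on or off.
    ftp = ftplib.FTP(host)
    ftp.login()               # anonymous login
    ftp.set_pasv(passive)     # passive=0 forces active (PORT) transfers
    ftp.cwd(path)
    lines = []
    ftp.retrlines("LIST", lines.append)
    ftp.quit()
    return lines
----------
If fetch_listing(host, passive=1) succeeds on one of the sites above
where passive=0 hangs, then the data connection, not ftplib's control
logic, is the problem.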
The first two are very close to me in terms of Internet distance (hop
count), so timeouts should not be the problem; these sites sit near my ISP
(Radio MSU, at Moscow State University).
Can anyone with better knowledge of the FTP protocol take a look and
help? Does the latest Python (from CVS) perform any better, if anyone is
willing to test?
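[Editorial note: later Pythons grew real socket timeouts
(socket.setdefaulttimeout was added in 2.3), which turn this kind of hang
into a catchable exception. A sketch under that assumption:]
----------
import socket

# Any socket created after this call (including those urllib opens
# internally) gives up after 30 seconds instead of blocking forever.
socket.setdefaulttimeout(30)
----------
With the default timeout set, a hung retrieve raises a socket error after
30 seconds, which the existing try/except in the test program would catch
and print.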
Oleg.
----
Oleg Broytmann Foundation for Effective Policies phd at phd.russ.ru
Programmers don't die, they just GOSUB without RETURN.