[Tutor] Can't make urllib do ftp
Danny Yoo
dyoo@hkn.eecs.berkeley.edu
Sun, 18 Nov 2001 18:53:26 -0800 (PST)
On Sun, 18 Nov 2001, Danny Yoo wrote:
> That is, if urllib sees a proxy, it will "fix" the url to include that
> proxy host information, and that's the only thing I can think of that
> triggers your error message that deals with that TypeError:
>
> >   File "/usr/lib/python2.1/urllib.py", line 427, in open_ftp
> >     host, path = splithost(url)
> >   File "/usr/lib/python2.1/urllib.py", line 932, in splithost
> >     match = _hostprog.match(url)
> > TypeError: expected string or buffer
>
>
> Do you know if you have one set up? Can you check your environment to see
> if you have an "FTP_PROXY" variable? If so, that can do something
> strange.
>
> I'll continue to read through urllib to see what the consequences of
> having an FTP proxy are. Hope this helps!
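For what it's worth, here is a quick sketch of one way to do that check. It assumes only that urllib picks up proxies from environment variables whose names end in "_proxy", in either case; find_proxy_vars is a name I made up for illustration, not something in urllib.

```python
import os

def find_proxy_vars(environ=None):
    """Return {name: value} for every non-empty variable ending in '_proxy'.

    find_proxy_vars is a made-up helper, not part of urllib. The name
    match is case-insensitive, so FTP_PROXY and ftp_proxy both show up.
    """
    if environ is None:
        environ = os.environ
    found = {}
    for name, value in environ.items():
        if name.lower().endswith("_proxy") and value:
            found[name] = value
    return found

if __name__ == "__main__":
    # Print whatever proxy-looking variables are set in this shell.
    for name, value in sorted(find_proxy_vars().items()):
        print("%s=%s" % (name, value))
```

If this prints an FTP_PROXY (or ftp_proxy) line, that is probably the variable urllib is reacting to.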
Ok, I've read through urllib.py a little more, and I'm almost convinced
this problem has to do with FTP proxies, because there's an ugly asymmetry
between how open_http() and open_ftp() treat their url argument:
###
## Within urllib.py
def open_http(self, url, data=None):
    """Use HTTP protocol."""
    import httplib
    user_passwd = None
    if type(url) is types.StringType:
        host, selector = splithost(url)
        if host:
            user_passwd, host = splituser(host)
            host = unquote(host)
        realhost = host
###
The key point here is that open_http() checks whether the url is a
string. If it isn't, the url is a tuple, which is how urllib signals that
the request should go through a proxy. However, there is no analogous
'type(url)' check in open_ftp(), so it hands the tuple straight to
splithost(), and that's where the TypeError comes from!
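To make the asymmetry concrete, here is a small stand-alone sketch. It is not the real urllib code: the regular expression only approximates what splithost() uses, the two split_like_* functions are names I invented, and the tuple is my reading of the (proxy host, full url) pair that urllib builds when it finds a proxy.

```python
import re

# Roughly the pattern splithost() uses: '//host' followed by the rest.
_hostprog = re.compile(r"^//([^/]*)(.*)$")

def splithost(url):
    """Split '//host/path' into (host, path); (None, url) if no '//host'."""
    match = _hostprog.match(url)  # raises TypeError unless url is a string
    if match:
        return match.group(1, 2)
    return None, url

def split_like_open_http(url):
    """Accept a plain string or a (proxyhost, fullurl) tuple."""
    if isinstance(url, str):
        return splithost(url)
    # Proxy case: the tuple already holds the host and the selector.
    host, selector = url
    return host, selector

def split_like_open_ftp(url):
    """No type check, so a proxy tuple goes straight into the regex."""
    return splithost(url)

if __name__ == "__main__":
    proxied = ("proxy.example:3128", "ftp://ftp.example.com/pub/README")
    print(split_like_open_http(proxied))
    try:
        split_like_open_ftp(proxied)
    except TypeError as err:
        print("TypeError: %s" % err)
```

The second call fails inside the regular expression match, which is the same place the traceback points to.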
Ugh. Have you tried urllib2? urllib2 looks like a rewrite of urllib, and
may be more considerate about proxy situations. The example in urllib2
sets up a proxy handler alongside a caching FTP handler:
###
import urllib2

# authinfo is set up earlier in the urllib2 example as a basic-auth handler
authinfo = urllib2.HTTPBasicAuthHandler()
proxy_support = urllib2.ProxyHandler({"http" : "http://ahad-haam:3128"})
# build a new opener that adds authentication and caching FTP handlers
opener = urllib2.build_opener(proxy_support, authinfo,
                              urllib2.CacheFTPHandler)
# install it
urllib2.install_opener(opener)
###
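Note that the example only maps the "http" scheme to a proxy. Below is a sketch of the same idea with an "ftp" entry. I have not tested it against a real proxy: build_ftp_proxy_opener is a made-up helper name, the proxy address is a placeholder, and the fallback import is only there so the sketch also runs on Pythons where these classes live in urllib.request.

```python
try:
    import urllib2 as urlreq         # the module discussed in this thread
except ImportError:
    import urllib.request as urlreq  # same classes under their later name

def build_ftp_proxy_opener(proxy_url):
    """Return an opener that is told to send ftp:// requests via proxy_url.

    build_ftp_proxy_opener is a made-up helper name; proxy_url would be
    whatever your FTP_PROXY variable holds.
    """
    # Map the "ftp" scheme explicitly rather than relying on the environment.
    proxy_support = urlreq.ProxyHandler({"ftp": proxy_url})
    return urlreq.build_opener(proxy_support, urlreq.CacheFTPHandler)

# Placeholder proxy address, reusing the host from the example above.
opener = build_ftp_proxy_opener("http://ahad-haam:3128")
# opener.open("ftp://ftp.example.com/pub/README") should then use the proxy.
```

Building the opener does not touch the network, so you can inspect opener.handlers before trying a real request.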
You may also want to notify people on comp.lang.python about this problem
with urllib.py, as this does seem aesthetically weird to me. Perhaps I'm
overreacting.