urllib2 and proxy support?

Wed Jan 4 14:42:20 EST 2006

Hello all,

I have a problem using urllib2 with a proxy that needs authentication.

I've tried the 'simple way':

-- code --
import urllib2

# example values for the post
my_url = 'http://www.python.org'
proxy_info = { 'host' : 'netcache.monentreprise.com',
               'port' : 3128,
               'user' : 'gaston.lagaffe',
               'pass' : 'jeanne55'
             }
proxy_support = urllib2.ProxyHandler({"http" :
"http://%(user)s:%(pass)s@%(host)s:%(port)d" % proxy_info})
opener = urllib2.build_opener(proxy_support)

# print proxies
print "Proxies", urllib2.getproxies()
# always prints "Proxies {}" even though I've set a proxy! :-(

req = urllib2.Request(url = my_url)
handle = urllib2.urlopen(req) # get an error, seems like proxy is not
-- code --

But this doesn't work. It seems that the proxy is not recognized by
urllib2. I've read a previous post [1] by Ray Slakinski, with John Lee's
answer, but unfortunately it seems that this problem in urllib2 is well

So my questions are:
- Is there a way to make this work, and if so, to make it work for
another user who doesn't have a custom (patched) urllib2?
- What would you advise for reliable web crawling with Python (I mean
good support for proxies, including authenticated ones, cookies, ...):
mechanize [2], pycurl [3], others?
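
On the first question, one thing that may be worth checking: an opener built
with build_opener() is only used if it is called directly or installed, and
getproxies() only reports proxies taken from the environment or system
settings, not those given to a ProxyHandler. The sketch below is a hedged
illustration of that point, not a confirmed fix for the problem discussed in
[1]. The helper name make_proxy_opener and the alias urlreq are made up for
this example; the import fallback only lets the same code run on Python 3,
where the module is called urllib.request.

```python
try:
    import urllib2 as urlreq          # Python 2, as used in this post
except ImportError:
    import urllib.request as urlreq   # Python 3 name for the same module

# Same example values as in the post.
proxy_info = {'host': 'netcache.monentreprise.com',
              'port': 3128,
              'user': 'gaston.lagaffe',
              'pass': 'jeanne55'}


def make_proxy_opener(info):
    """Hypothetical helper: build an opener that sends http requests
    through the proxy described by `info`."""
    proxy_url = "http://%(user)s:%(pass)s@%(host)s:%(port)d" % info
    handler = urlreq.ProxyHandler({"http": proxy_url})
    return urlreq.build_opener(handler), handler


opener, handler = make_proxy_opener(proxy_info)

# The handler keeps its own mapping; getproxies() does not reflect it.
print(handler.proxies)

# To actually use the proxy, either call the opener directly:
#   handle = opener.open('http://www.python.org')
# or install it so that plain urlopen() goes through it:
#   urlreq.install_opener(opener)
#   handle = urlreq.urlopen('http://www.python.org')
```

The network calls are left commented out because they depend on a reachable
proxy. Whether this is enough for an authenticated proxy on a given urllib2
version is an assumption to verify.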

[1] posted on 8 Nov 2005 on comp.lang.python, titled "urllib2 Opener
and Proxy/Authentication issues"
[2] http://wwwsearch.sourceforge.net/mechanize/
[3] http://pycurl.sourceforge.net/

Thank you
(and happy new year 2006!)

