urllib.request with proxy and HTTPS
Pavel Volkov
sailor at lists.xtsubasa.org
Fri Jun 30 03:53:17 EDT 2017
Hello,
I'm trying to make an HTTPS request with urllib.
OS: Gentoo
Python: 3.6.1
openssl: 1.0.2l
This is my test code:
===== CODE BLOCK BEGIN =====
import ssl
import urllib.request
from lxml import etree
PROXY = 'proxy.vpn.local:9999'
URL = "https://google.com"
proxy = urllib.request.ProxyHandler({'http': PROXY})
#context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_1)
context = ssl.SSLContext()
context.verify_mode = ssl.CERT_REQUIRED
context.check_hostname = True
secure_handler = urllib.request.HTTPSHandler(context = context)
opener = urllib.request.build_opener(proxy, secure_handler)
opener.addheaders = [('User-Agent', 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:54.0) Gecko/20100101 Firefox/54.0')]
response = opener.open(URL)
tree = etree.parse(response, parser=etree.HTMLParser())
print(tree.docinfo.doctype)
===== CODE BLOCK END =====
My first problem is that a CERTIFICATE_VERIFY_FAILED error is raised.
I've found that something similar happens on macOS, since Python installs
its own set of trusted CAs there.
But this isn't macOS, and I can fetch HTTPS pages normally with curl and
other tools.
===== TRACE BLOCK BEGIN =====
Traceback (most recent call last):
  File "/usr/lib64/python3.6/urllib/request.py", line 1318, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/usr/lib64/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/usr/lib64/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/usr/lib64/python3.6/http/client.py", line 1400, in connect
    server_hostname=server_hostname)
  File "/usr/lib64/python3.6/ssl.py", line 401, in wrap_socket
    _context=self, _session=session)
  File "/usr/lib64/python3.6/ssl.py", line 808, in __init__
    self.do_handshake()
  File "/usr/lib64/python3.6/ssl.py", line 1061, in do_handshake
    self._sslobj.do_handshake()
  File "/usr/lib64/python3.6/ssl.py", line 683, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "./https_test.py", line 21, in <module>
    response = opener.open(URL)
  File "/usr/lib64/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/usr/lib64/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/usr/lib64/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/usr/lib64/python3.6/urllib/request.py", line 1361, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/usr/lib64/python3.6/urllib/request.py", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)>
===== TRACE BLOCK END =====
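One thing that may be worth checking here, offered as an assumption rather than a confirmed diagnosis for this machine: a context created directly from ssl.SSLContext starts with an empty trust store, so CERT_REQUIRED has no CA certificates to verify against. The sketch below compares such a bare context with ssl.create_default_context(), which asks OpenSSL to load the system's default CA certificates. The names bare_context, bare_ca_count and default_context are only illustrative, and ssl.PROTOCOL_TLS_CLIENT is passed explicitly (the test code above uses the no-argument constructor).

```python
import ssl

# A context built directly from SSLContext has no CA certificates loaded,
# so certificate verification has nothing to validate the server against.
bare_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
bare_ca_count = bare_context.cert_store_stats()['x509_ca']

# create_default_context() turns on CERT_REQUIRED and check_hostname and
# requests the system default CA certificates.
default_context = ssl.create_default_context()

print('CA certificates in bare context:', bare_ca_count)
print('default verify_mode:', default_context.verify_mode)
print('default check_hostname:', default_context.check_hostname)
```

If this is the cause, passing default_context to urllib.request.HTTPSHandler, or calling load_default_certs() on the existing context, might be enough to make verification succeed.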
The second problem is that the proxy is used for HTTP requests, but for
HTTPS a direct connection is made (verified with tcpdump).
I've read at docs.python.org that previous versions of Python couldn't
handle HTTPS with proxy but that shortcoming seems to have gone now.
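A possible explanation for the direct connection, as a hedged sketch: urllib.request.ProxyHandler only proxies the URL schemes that appear as keys in the mapping it is given, and the test code passes only an 'http' key. The helper make_proxy_handler below is a hypothetical name used for illustration; it assumes the same proxy address should serve both schemes.

```python
import urllib.request

PROXY = 'proxy.vpn.local:9999'


def make_proxy_handler(proxy_address):
    """Return a ProxyHandler that covers both http:// and https:// URLs."""
    # Each key is a URL scheme; a scheme missing from the mapping is not
    # routed through the proxy by this handler.
    return urllib.request.ProxyHandler({
        'http': proxy_address,
        'https': proxy_address,
    })


proxy = make_proxy_handler(PROXY)
opener = urllib.request.build_opener(proxy)
print(sorted(proxy.proxies))
```

Whether the proxy at that address actually accepts CONNECT requests for HTTPS is not something this sketch can confirm.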
Please help :)