[Python-bugs-list] [ python-Bugs-494762 ] urllib2 on python2.2 ssl bug

noreply@sourceforge.net noreply@sourceforge.net
Fri, 28 Dec 2001 17:18:58 -0800


Bugs item #494762, was opened at 2001-12-18 13:34
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=494762&group_id=5470

Category: Python Library
Group: Python 2.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Marcus Felipe Pereira (makim)
Assigned to: Jeremy Hylton (jhylton)
Summary: urllib2 on python2.2 ssl bug

Initial Comment:
urllib2 on Python 2.2 can't get some SSL pages.

It seems to depend on the server and the
issuer of the key.

The server shown below (https://wwws.task.com.br) 
uses IIS 5.0 and a 128-bit key issued by Thawte.

I've tested on Python 2.1 and it's OK.

******** Code *************
import os,urllib2
os.environ["http_proxy"]=''
f = urllib2.urlopen("https://wwws.task.com.br/i.htm")
print f.read()


******** Output ************
Traceback (most recent call last):
  File "./httpstest", line 6, in ?
    f = urllib2.urlopen("https://wwws.task.com.br/i.htm")
  File "/usr/lib/python2.2/urllib2.py", line 138, in urlopen
    return _opener.open(url, data)
  File "/usr/lib/python2.2/urllib2.py", line 322, in open
    '_open', req)
  File "/usr/lib/python2.2/urllib2.py", line 301, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.2/urllib2.py", line 792, in https_open
    return self.do_open(httplib.HTTPS, req)
  File "/usr/lib/python2.2/urllib2.py", line 774, in do_open
    code, msg, hdrs = h.getreply()
  File "/usr/lib/python2.2/httplib.py", line 728, in getreply
    response = self._conn.getresponse()
  File "/usr/lib/python2.2/httplib.py", line 572, in getresponse
    response = self.response_class(self.sock)
  File "/usr/lib/python2.2/httplib.py", line 98, in __init__
    self.fp = sock.makefile('rb', 0)
  File "/usr/lib/python2.2/httplib.py", line 607, in makefile
    buf = self.__ssl.read()
socket.sslerror: (5, 'EOF occurred in violation of protocol')

*****************************************


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-12-28 17:18

Message:
Logged In: YES 
user_id=21627

The problem does not lie in the urllib module, and likely
also not in the SSL support in the socket module. Please
refer to the attached https.py.

On a server that does an orderly SSL shutdown (e.g. sf.net),
this raises
socket.sslerror: (6, 'TLS/SSL connection has been closed')
i.e. socket.SSL_ERROR_ZERO_RETURN. On wwws.task.com.br, it prints

HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Sat, 29 Dec 2001 00:50:32 GMT
Content-Type: text/html
Accept-Ranges: bytes
Last-Modified: Tue, 18 Dec 2001 21:17:02 GMT
ETag: "05be155988c11:85f"
Content-Length: 80


<HTML>
<HEAD><TITLE>HTTPS Test</TITLE></HEAD>
<BODY>HTTPS Test</BODY>
</HTML>
Traceback (most recent call last):
  File "https.py", line 16, in ?
    buf = ssl.read()
socket.sslerror: (5, 'EOF occurred in violation of protocol')

So I still think that the bug is on the server side
(Microsoft IIS, in this case), which does not perform a proper
connection shutdown, but just closes the connection.

This problem went unnoticed in 2.1, since
httplib.FakeSocket.makefile would read until any kind of
exception occurred, then consider the exception as the end
of the conversation. This was bug #458835; Jeremy fixed it
in httplib.py 1.41.

I don't think we should restore the 2.1 behaviour. In the
specific case of IIS, the best thing would be to honor the
Content-length, i.e. not try to read more than
content-length bytes; that would require implementing a true
file-like object, instead of re-using StringIO.
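
As a rough illustration of the Content-Length idea above, here is a minimal
sketch (not the httplib implementation). It is written against Python 3 rather
than 2.2, and the names read_exactly and AbruptCloseConn are hypothetical; the
fake connection only mimics a server that closes without an SSL shutdown.

```python
def read_exactly(conn, length, chunk_size=4096):
    """Read exactly `length` bytes from `conn` and then stop.

    Because no read is attempted after the body is complete, an abrupt
    close by the server after the last body byte is never observed.
    """
    chunks = []
    remaining = length
    while remaining > 0:
        buf = conn.read(min(chunk_size, remaining))
        if not buf:
            # The peer closed before sending the promised number of bytes.
            raise EOFError("connection closed with %d bytes missing" % remaining)
        chunks.append(buf)
        remaining -= len(buf)
    return b"".join(chunks)


class AbruptCloseConn:
    """Hypothetical stand-in for a connection that errors once drained."""

    def __init__(self, payload):
        self._payload = payload

    def read(self, n):
        if not self._payload:
            raise ConnectionError("EOF occurred in violation of protocol")
        buf, self._payload = self._payload[:n], self._payload[n:]
        return buf


# The response above declared Content-Length: 80.
body = read_exactly(AbruptCloseConn(b"x" * 80), 80, chunk_size=32)
```

The caller is assumed to have parsed the headers already and to pass the
declared length in; responses without a Content-Length would still need a
different strategy.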

The best work-around (for this case, and the general case of
a server violating the SSL protocol) is to special-case
socket.SSL_ERROR_SYSCALL in addition to
SSL_ERROR_ZERO_RETURN, perhaps checking for the message
"EOF occurred in violation of protocol" (since this message
is generated inside Python).
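
A minimal sketch of that work-around, under some assumptions: it uses the
Python 3 ssl module (ssl.SSLError, ssl.SSL_ERROR_SYSCALL,
ssl.SSL_ERROR_ZERO_RETURN) as a stand-in for 2.2's socket.sslerror and
socket.SSL_ERROR_* names, and read_until_closed and FakeConn are hypothetical
helpers, not part of httplib.

```python
import ssl

EOF_MESSAGE = "EOF occurred in violation of protocol"


def read_until_closed(conn):
    """Read from `conn` until the peer closes the connection.

    An orderly shutdown (SSL_ERROR_ZERO_RETURN) and an abrupt close
    (SSL_ERROR_SYSCALL carrying the EOF message) both end the stream;
    any other SSL error is re-raised.
    """
    chunks = []
    while True:
        try:
            buf = conn.read()
        except ssl.SSLError as err:
            code = err.args[0] if err.args else None
            text = str(err.args[1]) if len(err.args) > 1 else ""
            if code == ssl.SSL_ERROR_ZERO_RETURN:
                break
            if code == ssl.SSL_ERROR_SYSCALL and EOF_MESSAGE in text:
                break
            raise
        if not buf:
            break
        chunks.append(buf)
    return b"".join(chunks)


class FakeConn:
    """Hypothetical connection: yields chunks, then raises `error`."""

    def __init__(self, chunks, error):
        self._chunks = list(chunks)
        self._error = error

    def read(self):
        if self._chunks:
            return self._chunks.pop(0)
        raise self._error


abrupt = FakeConn([b"HTTPS ", b"Test"],
                  ssl.SSLError(ssl.SSL_ERROR_SYSCALL, EOF_MESSAGE))
data = read_until_closed(abrupt)
```

Unlike the 2.1 behaviour, which treated any exception as end of data, this
only swallows the two specific conditions named above.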

In summary, I agree with Jeremy that these changes were for
the better...

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-12-28 14:37

Message:
Logged In: YES 
user_id=6380

BTW it's not specific to urllib2, regular old urllib has the
same problem on 2.2 but not on 2.1.1.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-12-28 14:37

Message:
Logged In: YES 
user_id=6380

Hm, I do get the same outcome: Python 2.1.1 gives a valid
result, while Python 2.2 gives socket.sslerror: (5, 'EOF
occurred in violation of protocol').

There have been a few changes in the SSL support in 2.2. I'm
assigning this to Jeremy Hylton, who made some of those
changes thinking they were for the better. :-)

----------------------------------------------------------------------

Comment By: Marcus Felipe Pereira (makim)
Date: 2001-12-19 14:40

Message:
Logged In: YES 
user_id=405476

What is strange is that the same code works in Python 2.1 on the 
same machine.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-12-19 12:38

Message:
Logged In: YES 
user_id=21627

If OpenSSL says the server violates the protocol, I'm pretty
sure OpenSSL is right. So I fail to see the problem in Python.

----------------------------------------------------------------------
