[Python-bugs-list] [ python-Bugs-225744 ] httplib does not check if port is valid (easy to fix?)

noreply@sourceforge.net noreply@sourceforge.net
Tue, 02 Jul 2002 13:50:33 -0700


Bugs item #225744, was opened at 2000-12-14 04:45
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=225744&group_id=5470

Category: Python Library
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Luca de Alfaro (dealfaro)
Assigned to: Jeremy Hylton (jhylton)
Summary: httplib does not check if port is valid (easy to fix?)

Initial Comment:
In httplib.py, line 336, the following code appears: 

    def _set_hostport(self, host, port):
        if port is None:
            i = string.find(host, ':')
            if i >= 0:
                port = int(host[i+1:])
                host = host[:i]
            else:
                port = self.default_port
        self.host = host
        self.port = port

This code breaks if the host string ends with ":", so that
int("") is called.  In the old (1.5.2) version of this 
module, the corresponding int() conversion used to be 
enclosed in a try/except pair: 

                try: port = string.atoi(port)
                except string.atoi_error:
                    raise socket.error, "nonnumeric port"

and this fixed the problem.  
Note BTW that now the error reported by int is 
"ValueError: invalid literal for int():"
rather than the above string.atoi_error. 
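
For illustration, the 1.5.2-style guard described above might look as follows
in present-day Python 3 syntax.  This is only a sketch: split_hostport and its
default_port parameter are hypothetical names, not part of httplib, and in
Python 3 socket.error is an alias for OSError.

```python
import socket


def split_hostport(host, default_port=80):
    """Split 'host:port' (hypothetical helper, not the httplib method)."""
    i = host.find(":")
    if i < 0:
        # No colon at all: fall back to the default port.
        return host, default_port
    try:
        port = int(host[i + 1:])
    except ValueError:
        # int("") and int("abc") both raise ValueError; re-raise it as the
        # socket-level error that callers written for 1.5.2 expected.
        raise socket.error("nonnumeric port")
    return host[:i], port
```

With this guard, a host string such as "www.python.org:" fails with
socket.error rather than with a bare ValueError.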

I found this problem while downloading web pages, 
but unfortunately I cannot pinpoint which page 
caused the problem. 

Luca de Alfaro

----------------------------------------------------------------------

>Comment By: Jeremy Hylton (jhylton)
Date: 2002-07-02 20:50

Message:
Skip's change was checked in a while ago.


----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2002-03-09 14:30

Message:
Here's a suggested patch that matches the exception
structure used by httplib.  It adds a subclass of 
HTTPException called InvalidURL and raises that when
int(port) gets incorrect input.
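
The patch itself is not reproduced in this message.  A minimal sketch of the
approach it describes, written in Python 3 syntax, could look like the
following.  The HTTPException class here is a stand-in for httplib's own base
class, and split_hostport is a hypothetical helper, so this should not be read
as the code that was checked in.

```python
class HTTPException(Exception):
    """Stand-in for httplib's base exception class."""


class InvalidURL(HTTPException):
    """Raised when the host string carries an unusable port."""


def split_hostport(host, default_port=80):
    """Split 'host:port', reporting a bad port as InvalidURL (sketch)."""
    i = host.find(":")
    if i < 0:
        return host, default_port
    try:
        port = int(host[i + 1:])
    except ValueError:
        # Translate the generic ValueError into the module's own hierarchy.
        raise InvalidURL("nonnumeric port: %r" % host[i + 1:])
    return host[:i], port
```

Callers that already catch HTTPException would then see the bad port without
having to catch ValueError.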


----------------------------------------------------------------------

Comment By: MartinThomas (martinthomas)
Date: 2001-01-10 21:19

Message:
I have been trying to pin down a problem in Redhat's
Update agent which is written in Python (..mostly)
which happens when a proxy is specified. 

In RH7.0, they are still using Python 1.5.2 and the message
'nonnumeric port' is received when a proxy is specified
in the following form:
http://proxy.yourdomain.com:80
but  the following:
proxy.yourdomain.com:80
works.
Looking at the code, it seems that it expects that the only
colon would be near the end of the url and makes no
allowance for 'http:' nor 'https:'.
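
As a rough sketch of a parser that tolerates both forms of the proxy
setting, the following Python 3 code strips an optional scheme before looking
for the port.  parse_proxy is a hypothetical name and is not taken from the
Update agent or from httplib.  The sketch ignores IPv6 literals and
user:password prefixes.

```python
def parse_proxy(address, default_port=80):
    """Return (host, port) for 'host:port' or 'scheme://host:port' (sketch)."""
    # Drop an optional "http://" or "https://" prefix, so that the colon in
    # the scheme is not mistaken for the port separator.
    if "://" in address:
        address = address.split("://", 1)[1]
    # Drop any trailing path such as "/".
    address = address.split("/", 1)[0]
    host, sep, port_text = address.rpartition(":")
    if not sep:
        # No colon left: the whole string is the host name.
        return address, default_port
    try:
        return host, int(port_text)
    except ValueError:
        raise ValueError("nonnumeric port: %r" % port_text)
```

Both "http://proxy.yourdomain.com:80" and "proxy.yourdomain.com:80" then
yield the same host and port.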

Regards / Martin

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2000-12-18 22:38

Message:
Thanks for explaining this more.

I am surprised that a 301 redirect would give an invalid port -- but surely webmasters aren't perfect. :-)

The argument that urllib.Urlopener.open() checks for socket.error but not for other errors is a good one.

However I don't see the httplib.py code raising socket.error elsewhere.  I'll ask Jeremy.  The rest of the module seems to be using a totally different set of exceptions.  On the other hand, it *can* raise socket.error, implicitly (when various socket calls are being made).



----------------------------------------------------------------------

Comment By: Luca de Alfaro (dealfaro)
Date: 2000-12-18 22:25

Message:
There are three (minor?) problems with raising
ValueError. 

1) Compatibility.  I had some code for 1.5.2 that
was trying to load web pages checking for various
errors, and it was expecting this error to cause
a socket error, not a value error. 

2) Accuracy.  ValueError can be caused by
anything.  The 'non-numeric port' error is much 
more informative.  I don't want to catch
ValueError, because it can be caused in too 
many situations.  I also cannot check 
myself that the port is fine, because the 
port and the URL are often given by a  
redirect (errors 301 and 302, if I remember
correctly).  This in fact was the situation 
that caused the problem. 
Hence, my only real solution was to patch my version of httplib. 

3) Style.  I am somewhat new to Python, but I was
under the impression that, stylistically, 
a ValueError was used to convey a situation that
was the fault of the programmer, while other 
more specific errors were used for unexpected 
situations (communication, etc).  Since the 
socket is the result of a URL redirection 
(errors 301 or 302), the programmer is not in 
a position to prevent this error by "better 
checking".  Hence, I would consider a
network-related exception to be more appropriate 
here. 
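
To make the point about catching a specific exception concrete, here is a
sketch of caller code using http.client, the Python 3 successor of httplib,
where a nonnumeric port is reported as http.client.InvalidURL.
connection_for is a hypothetical helper name, and the host strings in the
usage note are placeholders.

```python
import http.client


def connection_for(netloc):
    """Return an unconnected HTTPConnection, or None if netloc is malformed."""
    try:
        # The constructor only parses host and port; it does not open a
        # socket, so no network access happens here.
        return http.client.HTTPConnection(netloc)
    except http.client.InvalidURL:
        # A bad port in a redirect target lands here and nowhere else, so
        # unrelated ValueErrors are not swallowed by accident.
        return None
```

A redirect target such as "example.com:abc" then yields None, while
"example.com:8080" yields a connection object with port 8080.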

But who am I to argue with the creator of Python? 
;-)

Luca


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2000-12-14 14:37

Message:
The only effect is that it raises ValueError instead of socket.error.
Where is this a problem?

(Note that string.atoi_error is an alias for ValueError.)
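
A small sketch of the practical difference, in Python 3 syntax (where
socket.error is an alias for OSError and string.atoi no longer exists):
a handler written for socket.error does not catch the ValueError raised by
int("").  port_or_none is a hypothetical name used only for this
illustration.

```python
import socket


def port_or_none(text):
    """Convert text to a port number, guarding only against socket.error."""
    try:
        return int(text)
    except socket.error:
        # Never reached for bad input: int() raises ValueError, which is
        # not a subclass of socket.error.
        return None
```

Calling port_or_none("") therefore propagates ValueError to the caller
instead of returning None.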


----------------------------------------------------------------------
