exception in urllib2

7stud bbxx789_05ss at yahoo.com
Sun Feb 1 06:20:26 EST 2009


On Feb 1, 3:34 am, asit <lipu... at gmail.com> wrote:
> I hv been developing a link scanner. Here the objective is to
> recursively scan a particular web site.
>
> During this, my script met http://images.google.co.in/imghp?hl=en&tab=wi
> and passed it to the scan function, whose body is like this..
>
> def scan(site):
>     log=open(logfile,'a')
>     log.write(site + "\n")
>     site = "http://" + site.lower()
>     try:
>         site_data = urllib.urlopen(site)
>         parser = MyParser()
>         parser.parse(site_data.read())
>     except(IOError),msg:
>         print "Error in connecting site ", site
>         print msg
>     links = parser.get_hyperlinks()
>     for l in links:
>         log.write(l + "\n")
>
> But it throws a weird exception like this...
>
> Traceback (most recent call last):
>   File "I:\Python26\linkscan1.py", line 104, in <module>
>     main()
>   File "I:\Python26\linkscan1.py", line 95, in main
>     scan(lk)
>   File "I:\Python26\linkscan1.py", line 65, in scan
>     site_data = urllib.urlopen(site)
>   File "I:\Python26\lib\urllib.py", line 87, in urlopen
>     return opener.open(url)
>   File "I:\Python26\lib\urllib.py", line 203, in open
>     return getattr(self, name)(url)
>   File "I:\Python26\lib\urllib.py", line 327, in open_http
>     h = httplib.HTTP(host)
>   File "I:\Python26\lib\httplib.py", line 984, in __init__
>     self._setup(self._connection_class(host, port, strict))
>   File "I:\Python26\lib\httplib.py", line 656, in __init__
>     self._set_hostport(host, port)
>   File "I:\Python26\lib\httplib.py", line 668, in _set_hostport
>     raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
> httplib.InvalidURL: nonnumeric port: ''
>
> How can i handle this ???

"Handle" as in how do I catch that exception?  The exception's name is
given in the error message.  Look at the last line.  To catch that
exception, you can do this:

import httplib   #that's the module the exception lives in
                 #as indicated in error message

try:
    ....
    ....
    ....
except httplib.InvalidURL, e:
    print e
    print "Bad url"
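
The "except(IOError)" clause in the original scan() does not catch this
error because httplib.InvalidURL derives from httplib.HTTPException, which
is not a subclass of IOError.  Below is a minimal sketch of the same
try/except pattern that runs without touching the network.  The helper name
host_is_valid and the sample host strings are made up for illustration, and
the import fallback is only there so the snippet also runs on Python 3,
where the module is called http.client:

```python
try:
    import httplib                 # Python 2, as used in the post
except ImportError:
    import http.client as httplib  # Python 3 name for the same module


def host_is_valid(host):
    """Return True if httplib accepts `host`, False on InvalidURL.

    host_is_valid is a hypothetical helper, not part of the original script.
    """
    try:
        # Constructing the connection object parses "host:port";
        # it does not open a socket.
        httplib.HTTPConnection(host)
    except httplib.InvalidURL as e:
        print("Bad url: %s" % e)
        return False
    return True
```

Calling host_is_valid("example.com:abc") should report a nonnumeric port
and return False, while host_is_valid("example.com:80") returns True.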






