[Python-bugs-list] [ python-Bugs-482171 ] webchecker dies on file: URLs w/o robots

noreply@sourceforge.net
Wed, 10 Apr 2002 09:16:57 -0700


Bugs item #482171, was opened at 2001-11-15 11:06
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=482171&group_id=5470

Category: Demos and Tools
Group: None
Status: Open
Resolution: Fixed
Priority: 5
Submitted By: Fred L. Drake, Jr. (fdrake)
Assigned to: Guido van Rossum (gvanrossum)
Summary: webchecker dies on file: URLs w/o robots

Initial Comment:
When webchecker is run on local documents using a file:
URL, it dies if there is not a robots.txt in the root
directory:

cj42289-a(.../python/Doc); python ../Tools/webchecker/webchecker.py file:`pwd`/html/api/
webchecker version 1.24
Traceback (most recent call last):
  File "../Tools/webchecker/webchecker.py", line 858, in ?
    main()
  File "../Tools/webchecker/webchecker.py", line 205, in main
    c.addroot(arg)
  File "../Tools/webchecker/webchecker.py", line 324, in addroot
    self.addrobot(root)
  File "../Tools/webchecker/webchecker.py", line 337, in addrobot
    rp.read()
  File "/usr/local/lib/python2.2/robotparser.py", line 43, in read
    f = opener.open(self.url)
  File "/usr/local/lib/python2.2/urllib.py", line 178, in open
    return getattr(self, name)(url)
  File "/usr/local/lib/python2.2/urllib.py", line 405, in open_file
    return self.open_local_file(url)
  File "/usr/local/lib/python2.2/urllib.py", line 412, in open_local_file
    stats = os.stat(localname)
OSError: [Errno 2] No such file or directory: '/robots.txt'
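
The path in the final line of the traceback is '/robots.txt', not a file
under html/api/. A likely explanation, sketched below, is that the
robots.txt URL is formed by joining "/robots.txt" onto the root URL; that
join is an assumption about webchecker's behaviour, not something shown in
this report. The root URL is a hypothetical stand-in, and the sketch is
written for current Python 3 rather than the Python 2.2 used above.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical root, standing in for file:`pwd`/html/api/ from the report.
root = "file:///home/user/python/Doc/html/api/"

# Assumption: the robots.txt URL is built by joining "/robots.txt" onto the root.
robots_url = urljoin(root, "/robots.txt")

# The leading slash discards the root's directory part, so the resulting
# path refers to the top of the local filesystem.
robots_path = urlparse(robots_url).path
print(robots_path)
```

For a file: URL the "server root" is the filesystem root, so the lookup
fails on any machine that has no /robots.txt.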


----------------------------------------------------------------------

Comment By: Jacques A. Vidrine (nectar)
Date: 2002-04-10 11:16

Message:
Logged In: YES 
user_id=14672

I should have been more clear.

The code in question in webchecker.py was (correctly) written
to handle IOError exceptions from urlopen.  However, due to
bug 541980, urlopen can (incorrectly) raise an OSError
exception when handling 'file:' URLs.

The fix made in rev 1.25 of webchecker.py was to catch both
the IOError and OSError exceptions.  This fix is incorrect,
because urlopen should not be raising OSError for this type
of condition (file not found).
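
A minimal sketch of the alternative argued for here: translate the
low-level error inside the library, so that callers only need to handle
the one documented exception.  This is not the actual urllib or
webchecker code.  FetchError, open_local_file and read_robots are
hypothetical names, and FetchError stands in for IOError because in
current Python 3 IOError is merely an alias of OSError, whereas in the
Python 2.2 discussed here they were distinct classes.

```python
import os


class FetchError(Exception):
    """Hypothetical stand-in for the exception an opener is documented to raise."""


def open_local_file(path):
    # Library side: convert the low-level OSError from os.stat() into the
    # documented exception type before it reaches the caller.
    try:
        os.stat(path)
    except OSError as exc:
        raise FetchError(exc.errno, exc.strerror, path) from exc
    return open(path, "rb")


def read_robots(path):
    # Caller side: only the documented exception has to be handled.
    try:
        with open_local_file(path) as f:
            return f.read()
    except FetchError:
        # A missing robots.txt is treated as "no restrictions".
        return None
```

With the translation done in one place, callers do not have to grow an
extra except clause each time a new low-level error leaks out.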


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-10 10:43

Message:
Logged In: YES 
user_id=6380

Maybe I'm dense, but I don't see the connection.

Can you elaborate?  I've reopened the bug report for your 
convenience.

----------------------------------------------------------------------

Comment By: Jacques A. Vidrine (nectar)
Date: 2002-04-10 09:00

Message:
Logged In: YES 
user_id=14672

Please see bug ID 541980.  I believe the fix here
(webchecker.py rev 1.25) was incorrect.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-12-11 16:41

Message:
Logged In: YES 
user_id=6380

Fixed in webchecker.py rev. 1.25.

----------------------------------------------------------------------
