[Python-bugs-list] [ python-Bugs-232000 ] New robotparser fails for non-HTTP schemes

nobody nobody@sourceforge.net
Mon, 26 Feb 2001 09:01:43 -0800


Artifact #232000, was updated on 2001-02-12 07:18
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=232000&group_id=5470

Category: Python Library
Group: None
Status: Closed
Priority: 7
Submitted By: Fred L. Drake, Jr.
Assigned to: Skip Montanaro
Summary: New robotparser fails for non-HTTP schemes

Initial Comment:
The new robotparser module fails for non-HTTP URLs where the old one did not.  In particular, reading a file: URL now raises an exception (socket.error: (111, 'Connection refused')).

This is due, at least in part, to the current code using httplib directly rather than going through urllib, which handles other URL schemes.  The code should be changed accordingly.
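
A rough sketch of the urllib-based approach, written against a current Python 3 standard library (urllib.request and urllib.robotparser) rather than the 2.1-era modules named in this report.  The helper name read_robots is made up for illustration, and this is not the change that was actually checked in:

```python
import urllib.request
import urllib.robotparser


def read_robots(url):
    """Fetch and parse a robots.txt file from any URL scheme urllib supports.

    Hypothetical helper: it shows the idea of letting urllib pick the
    protocol handler instead of opening an HTTP connection by hand.
    """
    parser = urllib.robotparser.RobotFileParser()
    parser.set_url(url)
    # urlopen dispatches on the scheme (http:, file:, ...), so a file: URL
    # is read from disk and no socket connection is attempted.
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    # Hand the rules to the parser as a list of lines.
    parser.parse(text.splitlines())
    return parser
```

With something like this, a file: URL for a robots.txt in a local HTML tree can be read and then queried with can_fetch() in the usual way.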

A good test case for this is running webchecker on a local tree of HTML files.  I currently get the exception:

cj42289-a(.../Doc/html); python ../../Tools/webchecker/webchecker.py -x file://`pwd`/api/
webchecker version 1.22
Traceback (most recent call last):
  File "../../Tools/webchecker/webchecker.py", line 824, in ?
    main()
  File "../../Tools/webchecker/webchecker.py", line 205, in main
    c.addroot(arg)
  File "../../Tools/webchecker/webchecker.py", line 324, in addroot
    self.addrobot(root)
  File "../../Tools/webchecker/webchecker.py", line 337, in addrobot
    rp.read()
  File "/usr/local/lib/python2.1/robotparser.py", line 46, in read
    connection.putrequest("GET", self.path)
  File "/usr/local/lib/python2.1/httplib.py", line 426, in putrequest
    self.send(str)
  File "/usr/local/lib/python2.1/httplib.py", line 368, in send
    self.connect()
  File "/usr/local/lib/python2.1/httplib.py", line 352, in connect
    self.sock.connect((self.host, self.port))
socket.error: (111, 'Connection refused')

Assigned to Skip since he's the robots.txt guru.

----------------------------------------------------------------------

Comment By: Skip Montanaro
Date: 2001-02-26 09:01

Message:

Yes, my apologies.  Fixed by version 1.9.


----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr.
Date: 2001-02-26 06:38

Message:

Skip, is this fixed now?

----------------------------------------------------------------------
