[Patches] [Patch #102229] a better robotparser.py module

noreply@sourceforge.net
Thu, 04 Jan 2001 09:43:35 -0800


Patch #102229 has been updated. 

Project: python
Category: Modules
Status: Open
Submitted by: calvin
Assigned to: gvanrossum
Summary: a better robotparser.py module

Follow-Ups:

Date: 2000-Nov-02 09:40
By: calvin

Comment:
I have written a new RobotParser module 'robotparser2.py'.

This module

o is backward compatible with the old one

o matches user agents correctly (robotparser.py is buggy here)

o strips comments correctly (robotparser.py is buggy here)

o uses httplib instead of urllib.urlopen() to catch HTTP
  connect errors correctly (robotparser.py is buggy here)

o implements not only the draft at
  http://info.webcrawler.com/mak/projects/robots/norobots.html
  but also the new one at
  http://info.webcrawler.com/mak/projects/robots/norobots-rfc.html
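
To illustrate the comment stripping and user-agent matching points above, here is a minimal sketch. It is not the code from robotparser2.py: the helper names strip_comment and agent_matches are hypothetical, and the matching rule is an assumption taken from the norobots draft, which recommends a case-insensitive substring match on the robot name without version information.

```python
def strip_comment(line):
    """Drop everything from the first '#' and trim surrounding whitespace."""
    return line.split("#", 1)[0].strip()


def agent_matches(record_agent, useragent):
    """Return True if a User-agent record applies to the given robot.

    Assumed rule: '*' matches every robot; otherwise do a case-insensitive
    substring match against the robot name with any "/version" removed.
    """
    if record_agent == "*":
        return True
    product = useragent.split("/", 1)[0].lower()
    return record_agent.lower() in product


# Hypothetical robots.txt lines, used only for illustration.
sample = [
    "User-agent: FooBot   # our crawler",
    "Disallow: /private/  # keep out",
    "# a comment-only line",
]

# Strip comments, then discard lines that are empty afterwards.
cleaned = [strip_comment(line) for line in sample]
cleaned = [line for line in cleaned if line]

print(cleaned)
print(agent_matches("foobot", "FooBot/1.2"))
print(agent_matches("barbot", "FooBot/1.2"))
```

A real parser would also have to group records and apply the Disallow rules; the sketch only covers the two steps named in the list.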


Bastian Kleineidam

-------------------------------------------------------

Date: 2000-Nov-02 11:14
By: gvanrossum

Comment:
Skip, can you comment on this?  
-------------------------------------------------------

-------------------------------------------------------
For more info, visit:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102229&group_id=5470