[New-bugs-announce] [issue16099] robotparser doesn't support request rate and crawl delay parameters

Nikolay Bogoychev report at bugs.python.org
Mon Oct 1 14:58:25 CEST 2012


New submission from Nikolay Bogoychev:

Robotparser doesn't support two quite important optional parameters from the robots.txt file: Crawl-delay and Request-rate. I have implemented them in the following way:
(Robotparser should be initialized in the usual way:
rp = robotparser.RobotFileParser()
rp.set_url(..)
rp.read()
)

crawl_delay(useragent) - Returns the crawl delay in seconds that you need to wait between requests.
If no delay is specified, or it doesn't apply to this user agent, returns -1.
request_rate(useragent) - Returns the request rate as a list in the form [requests, seconds].
If no rate is specified, or it doesn't apply to this user agent, returns -1.
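
A minimal usage sketch, assuming the attached patch is applied (the URL and user agent name are placeholders):

import time
import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("http://www.example.com/robots.txt")
rp.read()

# Honour the Crawl-delay directive, if one applies.
delay = rp.crawl_delay("MyCrawler")
if delay != -1:
    time.sleep(delay)

# Honour the Request-rate directive, if one applies.
rate = rp.request_rate("MyCrawler")
if rate != -1:
    requests, seconds = rate  # at most `requests` requests per `seconds` seconds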

----------
components: Library (Lib)
files: robotparser.patch
keywords: patch
messages: 171711
nosy: XapaJIaMnu
priority: normal
severity: normal
status: open
title: robotparser doesn't support request rate and crawl delay parameters
type: enhancement
versions: Python 2.7
Added file: http://bugs.python.org/file27373/robotparser.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue16099>
_______________________________________

