[New-bugs-announce] [issue35457] robotparser reads empty robots.txt file as "all denied"

larsfuse report at bugs.python.org
Tue Dec 11 04:30:47 EST 2018

New submission from larsfuse <lars at cl.no>:

The standard (http://www.robotstxt.org/robotstxt.html) says:

> To allow all robots complete access:
> User-agent: *
> Disallow:
> (or just create an empty "/robots.txt" file, or don't use one at all)
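For reference, a sketch of what the standard implies, using Python 3's urllib.robotparser (an assumption here; the report below exercises the older Python 2 robotparser module): feeding the explicit "allow all" file quoted above should make every URL fetchable.

```python
from urllib import robotparser

# The explicit "allow all" robots.txt quoted above.
lines = ["User-agent: *", "Disallow:"]

rp = robotparser.RobotFileParser()
rp.parse(lines)

# An empty Disallow value means "disallow nothing", i.e. allow everything.
print(rp.can_fetch("*", "/"))       # True
print(rp.can_fetch("*", "/admin"))  # True
```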

Here I give Python an empty robots.txt file:
$ curl


import robotparser  # Python 2 module

robotsurl = "http://example.com/robots.txt"  # hypothetical URL serving the empty file
rp = robotparser.RobotFileParser()
rp.set_url(robotsurl)
rp.read()
print(robotsurl)
print("fetch /", rp.can_fetch(useragent="*", url="/"))
print("fetch /admin", rp.can_fetch(useragent="*", url="/admin"))


$ ./test.py
('fetch /', False)
('fetch /admin', False)

As the output shows, robotparser treats the entire site as blocked.
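For comparison, a sketch of the empty-file case under Python 3's urllib.robotparser (an assumption; the report above uses the Python 2 robotparser module). Note that this parser deliberately answers "denied" for every query until a robots.txt has actually been read or parsed, so the order of calls matters:

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()

# Before any robots.txt has been read, the Python 3 parser answers
# False ("denied") for every query as a safety measure.
print(rp.can_fetch("*", "/"))       # False

rp.parse([])  # feed an empty robots.txt

# Per the standard, an empty file disallows nothing.
print(rp.can_fetch("*", "/"))       # True
print(rp.can_fetch("*", "/admin"))  # True
```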

components: Library (Lib)
messages: 331595
nosy: larsfuse
priority: normal
severity: normal
status: open
title: robotparser reads empty robots.txt file as "all denied"
type: behavior
versions: Python 2.7

Python tracker <report at bugs.python.org>
