[issue13281] Make robotparser.RobotFileParser ignore blank lines
report at bugs.python.org
Sat Nov 12 12:51:38 CET 2011
Éric Araujo <merwok at netwok.org> added the comment:
First, I’d like to remind everyone that the robots spec is not an official Internet spec backed by an official body. It’s also not as important as (say) HTTP parsing.
For this bug, IMO the guiding principle should be Postel’s Law. What harm is there in being more lenient than the spec? People apparently want to parse the robots.txt files with blank lines from last.fm and whitehouse.gov, and I don’t think anyone depends on the fact that blank lines cause the rest of the file to be ignored. Hence, I too think we should be pragmatic and allow blank lines, following the precedent established by other tools.
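To make the lenient behaviour concrete, here is a minimal sketch of a parser that skips blank lines instead of treating them as record separators. It is not the robotparser code and not a proposed patch; the function name parse_robots_lenient and the sample input are made up for illustration, and the only fields handled are User-agent, Allow and Disallow.

```python
def parse_robots_lenient(text):
    """Return a list of (user_agents, rules) records from robots.txt text.

    Hypothetical helper: blank lines are skipped and never end a record.
    """
    records = []
    agents, rules = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # lenient: a blank line does not terminate the record
        if ":" not in line:
            continue  # ignore lines that are not "field: value"
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if rules:
                # A User-agent line after some rules starts a new record.
                records.append((agents, rules))
                agents, rules = [], []
            agents.append(value)
        elif field in ("allow", "disallow"):
            if agents:  # rules seen before any User-agent line are dropped
                rules.append((field, value))
    if agents:
        records.append((agents, rules))
    return records


# Made-up input with blank lines inside a record.
sample = """User-agent: *

Disallow: /private/

Disallow: /tmp/
"""
records = parse_robots_lenient(sample)
```

With this approach a new record starts only when a User-agent line follows at least one rule line, so both Disallow lines in the sample stay attached to the "*" record.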
If you feel strongly about this, I can contact the robotstxt.org people.
Python tracker <report at bugs.python.org>