robotparser behavior on 403 (Forbidden) robots.txt files

John Nagle nagle at
Mon Jun 2 18:40:52 CEST 2008

   I just discovered that the "robotparser" module interprets
a 403 ("Forbidden") status on a "robots.txt" file as meaning
"all access disallowed". That's unexpected behavior.
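
   For anyone who wants to reproduce this without hitting a real site,
here is a minimal sketch. It assumes Python 3, where the module lives at
"urllib.robotparser", and it assumes that read() fetches the file through
urllib.request.urlopen (true of CPython, but an implementation detail).
The helper name and the example.com URLs are made up for illustration.

```python
import urllib.error
import urllib.robotparser
from unittest import mock

ROBOTS_URL = "http://example.com/robots.txt"  # placeholder, never contacted


def can_fetch_when_robots_returns(status, page="http://example.com/page.html"):
    """Return can_fetch() for `page` after robots.txt fails with `status`."""
    # Simulated HTTP error; no network traffic takes place.
    error = urllib.error.HTTPError(ROBOTS_URL, status, "simulated", None, None)
    parser = urllib.robotparser.RobotFileParser(ROBOTS_URL)
    # Assumption: read() calls urllib.request.urlopen, so patching it
    # makes the fetch of robots.txt fail with the chosen status code.
    with mock.patch("urllib.request.urlopen", side_effect=error):
        parser.read()
    return parser.can_fetch("*", page)


if __name__ == "__main__":
    for status in (403, 404):
        print(status, can_fetch_when_robots_returns(status))
```

   On the CPython versions I know of, the 403 case reports that nothing
may be fetched, while a 404 is treated as "no restrictions".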

   A major site ("") has their
"robots.txt" file set up that way.

   There's no real "robots.txt" standard, unfortunately.
So it's not definitively a bug.

				John Nagle

