Getting HTTP responses - a Python link-checking script

blair.bethwaite at gmail.com
Mon May 8 04:50:28 CEST 2006


Hi Folks,

I'm thinking about writing a script that can be run over a whole site
and produce a report of broken links and similar problems.

I've been playing with the urllib2 and httplib modules as a starting
point and have found that with urllib2 it doesn't seem possible to get
HTTP status codes.

I've had more success with httplib.
First I create a new HTTPConnection object with a given hostname and
port, then I try connecting to the host and catch any socket errors,
which I take to mean the server is either down or no longer exists at
that address.
If the connection succeeds, I request the resource in question, get
the response, and check its status code.
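
The httplib steps above can be sketched roughly as follows. This is only
a sketch under some assumptions: it is written for Python 3, where
httplib is named http.client, and the function name check_link, its
parameters, the 10-second timeout, and the use of a HEAD request
(rather than GET) are illustrative choices, not part of the approach
described above.

```python
import http.client


def check_link(host, path="/", port=80, timeout=10):
    """Return the HTTP status code for a resource, or None if unreachable.

    check_link is a hypothetical helper name used for illustration.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # Step 1: connect, treating any socket-level error as
        # "server is down or no longer exists at this address".
        # In Python 3 socket errors are subclasses of OSError.
        try:
            conn.connect()
        except OSError:
            return None

        # Step 2: request the resource and read the status code.
        # HEAD is an assumption here; it avoids downloading the body.
        try:
            conn.request("HEAD", path)
            return conn.getresponse().status
        except (OSError, http.client.HTTPException):
            return None
    finally:
        conn.close()
```

Calling check_link("www.python.org", "/") returns the status code as an
integer, or None if the host could not be reached. Some servers do not
handle HEAD well, so a real link checker might fall back to GET.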

So, I've got the tools I need to do the job well enough.  I'm just
wondering whether anybody can recommend any alternatives.

Cheers,
   -Blair



