Python program that validates an url against w3c markup validator
Fredrik Lundh
fredrik at pythonware.com
Wed Nov 29 01:58:50 EST 2006
yaru22 wrote:
> I'd like to create a program that validates bunch of urls against the
> w3c markup validator (http://validator.w3.org/) and store the result in
> a file.
>
> Since I don't know network programming, I have no idea how to start
> coding this program.
>
> I was looking at the python library and thought urllib or urllib2 may
> be used to make this program work.
>
> But I don't know how to send my urls to the w3c validator and get the
> result.
this should get you going, I think:
>>> import urllib
>>> uri = "http://www.python.org"
>>> f = urllib.urlopen("http://validator.w3.org/check?uri=" + uri)
>>> print f.headers
Date: Wed, 29 Nov 2006 06:52:33 GMT
Server: Apache/2.0.54 (Debian GNU/Linux) mod_perl/1.999.21 Perl/v5.8.4
Content-Language: en
X-W3C-Validator-Recursion: 1
X-W3C-Validator-Status: Valid
X-W3C-Validator-Errors: 0
Connection: close
Content-Type: text/html; charset=utf-8
>>> print f.headers["x-w3c-validator-status"]
Valid
>>> uri = "http://www.cnn.com"
>>> f = urllib.urlopen("http://validator.w3.org/check?uri=" + uri)
>>> print f.headers["x-w3c-validator-status"]
Invalid
>>> print f.headers["x-w3c-validator-errors"]
39
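to cover the rest of the question (a bunch of urls, with the results stored in a file), the session above could be extended roughly as follows. this is only a sketch, written for Python 3, where urlopen lives in urllib.request rather than urllib. the names check_url, summarize and validate_all, and the results.csv file name, are made up for illustration, and the code assumes the validator keeps returning the X-W3C-Validator-* headers shown above. it also percent-encodes the uri with urlencode, which the plain string concatenation above does not do.

```python
import csv
import urllib.parse
import urllib.request

# Endpoint taken from the interactive session above.
VALIDATOR = "http://validator.w3.org/check"


def check_url(uri):
    """Return the validator address for uri, with uri percent-encoded."""
    return VALIDATOR + "?" + urllib.parse.urlencode({"uri": uri})


def summarize(headers):
    """Pick the status and error count out of the response headers.

    Falls back to "Unknown" and "" if the headers are missing.
    """
    return (headers.get("X-W3C-Validator-Status", "Unknown"),
            headers.get("X-W3C-Validator-Errors", ""))


def validate_all(uris, out_path, opener=urllib.request.urlopen):
    """Validate each uri and write one CSV row per uri to out_path.

    opener is a hypothetical hook so the function can be tried
    without touching the network; by default it is urlopen.
    """
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["uri", "status", "errors"])
        for uri in uris:
            response = opener(check_url(uri))
            try:
                status, errors = summarize(response.headers)
            finally:
                response.close()
            writer.writerow([uri, status, errors])
```

calling validate_all(["http://www.python.org", "http://www.cnn.com"], "results.csv") would then leave one row per uri in results.csv. network errors are not handled here, so a failing request stops the run.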
</F>