Fast URL validation?
2002 at weholt.org
Tue Mar 11 13:13:24 CET 2003
I guess you could use httplib and just check the headers in the response
from the server instead of reading the entire document. Something like:
>>> import httplib
>>> conn = httplib.HTTPConnection("www.python.org")
>>> conn.request("GET", "/index.html")
>>> r1 = conn.getresponse()
>>> print r1.status, r1.reason
If r1.status is anything other than 200, the document is missing, the
server failed, or your request is being redirected, etc. If you wanted to
read the entire document, you'd have to do so before closing the connection:
>>> document = r1.read()
Or something similar. Don't know if this is the fastest way of doing it.
Please, post a better solution if you find one.
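One faster option, I think, would be a HEAD request instead of GET: the server sends back only the status line and headers, never the document body. Here's a rough sketch of that idea (the helper names url_ok and status_ok are my own, and I'm using the Python 3 spelling http.client, which is what httplib later became):

```python
# Sketch: validate a URL with a HEAD request so no body is transferred.
# (httplib was renamed http.client in Python 3.)
import http.client
from urllib.parse import urlsplit


def status_ok(status):
    # Treat 2xx (success) and 3xx (redirect) as "the URL is alive".
    return 200 <= status < 400


def url_ok(url, timeout=10):
    """Return True if the server answers the HEAD request with 2xx/3xx."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        return status_ok(conn.getresponse().status)
    except OSError:
        # DNS failure, connection refused, timeout, etc.
        return False
    finally:
        conn.close()
```

For many URLs you could speed this up further by reusing one connection per host, or by checking several hosts in parallel with threads.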
"Robert Oschler" <no_replies at fake_email_address.invalid> wrote in message
news:t%jba.11$_O5.47359 at news2.news.adelphia.net...
> Is there a way, with Python 2.2, to validate a URL without having to
> download the entire document? I want to rapidly (very rapidly!) scan a
> list of URL's and make sure I don't get a "server not found" or "page not
> found" error. What external library would I need to import and what
> Robert Oschler
> Android Technologies, Inc.
> The home of PowerSell! (tm)
> - "Power Tools for Amazon Associates" (sm)