url validator in python
Tim Chase
python.list at tim.thechases.com
Wed Mar 19 12:21:19 EDT 2008
> How can I check the validity of absolute urls with http scheme?
> example:
> "http://www.example.com/something.html" -> valid
> "http://www.google.com/ + Brite_AB_Iframe_URL + " -> invalid
You could try something like:
import urllib

tests = (
    ("http://www.google.com/ + Brite_AB_Iframe_URL + ", False),
    ("http://www.example.com/something.html", True),
    ("https://www.google.com/ + Brite_AB_Iframe_URL + ", False),
    ("https://www.example.com/something.html", True),
    )

def no_method(url):
    if ':' in url[:7]:
        # strip off the leading http:
        return url.split(':', 1)[1]
    return url

def is_valid_url(url):
    url = no_method(url)
    return url == urllib.quote(url)

for test_url, expected_result in tests:
    print "Testing %s\nagainst %s" % (
        no_method(test_url),
        urllib.quote(no_method(test_url))
        )
    actual_result = is_valid_url(test_url)
    print 'Pass: %s' % (actual_result == expected_result)
    print '='*70

The reason for no_method() is that urllib.quote() escapes the
colon after the scheme, so "http://..." would otherwise get
normalized to "http%3A//..." and never compare equal. You have
to strip off that bit before comparing.
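
For anyone running this on Python 3, where quote() lives in
urllib.parse and print is a function, the same idea might look
roughly like the sketch below. It keeps the names from the code
above and is only a port of the same heuristic, not a full URL
validator.

```python
from urllib.parse import quote

tests = (
    ("http://www.google.com/ + Brite_AB_Iframe_URL + ", False),
    ("http://www.example.com/something.html", True),
    ("https://www.google.com/ + Brite_AB_Iframe_URL + ", False),
    ("https://www.example.com/something.html", True),
)

def no_method(url):
    """Strip a leading 'http:' or 'https:' so quote() does not escape its colon."""
    if ':' in url[:7]:
        return url.split(':', 1)[1]
    return url

def is_valid_url(url):
    """Treat the URL as valid if percent-quoting leaves it unchanged."""
    url = no_method(url)
    return url == quote(url)

# Without stripping the scheme, the colon itself gets escaped:
print(quote("http://www.example.com/"))  # http%3A//www.example.com/

for test_url, expected_result in tests:
    actual_result = is_valid_url(test_url)
    print('%s -> %s (pass: %s)' % (
        test_url, actual_result, actual_result == expected_result))
```

Note that with its default safe='/' argument, quote() also escapes
characters such as '?', '=' and '&', so a URL with a query string
would be reported as invalid by this check. Passing a wider safe=
string to quote() is one way to loosen that, if it matters for
your URLs.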
-tkc