better regular expression?

Robert Brewer fumanchu at
Tue Dec 7 02:04:02 CET 2004

Vivek wrote:
> I am trying to construct a regular expression using the re module that
> matches for
> 1. my hostname
> 2. absolute from the root URLs including just "/"
> 3. relative URLs.
> Basically I want the pattern to not match for URLs that are not on my
> host.

Far easier would be grabbing the URLs and then using
urlparse.urlparse() on them. Relative paths should be combined with the
base scheme://location/path. When you want to see if they are on your
host, just use .startswith(). If you're worried about ../, make the
paths concrete (os paths) and call os.path.normpath before comparing.
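
A rough sketch of that approach, under a few assumptions: it targets
Python 3, where the urlparse module lives at urllib.parse; BASE_URL and
both function names are made up for illustration; and it compares the
parsed host rather than calling .startswith() on the whole string, so
that a host like www.example.com.evil.org is not mistaken for yours. It
also uses posixpath.normpath instead of os.path.normpath so the result
does not depend on the platform's path separator.

```python
import posixpath
from urllib.parse import urljoin, urlparse

# Hypothetical base: the page the links were scraped from.
BASE_URL = "http://www.example.com/docs/index.html"


def is_on_my_host(url, base_url=BASE_URL):
    """Return True if url, resolved against base_url, is on base_url's host."""
    # urljoin leaves absolute URLs alone and resolves "/", "x.html", "../x".
    absolute = urljoin(base_url, url)
    # Compare hosts case-insensitively; netloc includes any port.
    return urlparse(absolute).netloc.lower() == urlparse(base_url).netloc.lower()


def normalized_path(url, base_url=BASE_URL):
    """Return the resolved URL's path with "." and ".." segments collapsed."""
    absolute = urljoin(base_url, url)
    # An empty path (e.g. "http://www.example.com") is treated as the root.
    return posixpath.normpath(urlparse(absolute).path or "/")
```

With this, "/", "page2.html" and "../images/logo.png" all count as being
on the host, while "http://other.example.org/x" does not. If you only
want URLs under a certain directory, you could call .startswith() on the
result of normalized_path().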

Robert Brewer
Amor Ministries
fumanchu at
