about parse a link
Fredrik Lundh
fredrik at pythonware.com
Fri Sep 6 02:29:59 EDT 2002
"koko" wrote:
> if I have extracted the links on the page:
> e.g:
>
> http://www.uic.edu/index.htm
> on this page: there are
> a.htm
> b.htm
> c.htm
> http://www.uic.edu/home/e.htm
>
> how can I log the a.htm, b.htm, c.htm with the full web address?
base = "http://www.uic.edu/index.htm"

url_list = [
    "a.htm",
    "b.htm",
    "c.htm",
    "http://www.uic.edu/home/e.htm"
    ]

import urlparse

for url in url_list:
    print urlparse.urljoin(base, url)
prints
http://www.uic.edu/a.htm
http://www.uic.edu/b.htm
http://www.uic.edu/c.htm
http://www.uic.edu/home/e.htm
</F>
<!-- (the eff-bot guide to) the python standard library:
http://www.pythonware.com/people/fredrik/librarybook.htm
-->
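
For readers on Python 3: the urlparse module was renamed to urllib.parse, and print became a function. A minimal sketch of the same approach under that assumption follows; the name full_urls is introduced here for illustration and is not part of the original reply.

```python
from urllib.parse import urljoin

base = "http://www.uic.edu/index.htm"

url_list = [
    "a.htm",
    "b.htm",
    "c.htm",
    "http://www.uic.edu/home/e.htm",
]

# Resolve each link against the page it was found on. Relative links
# replace the last path segment of the base (index.htm); links that are
# already absolute come back unchanged.
full_urls = [urljoin(base, url) for url in url_list]

for full_url in full_urls:
    print(full_url)
```

This should print the same four addresses as the listing above. Because the relative names replace the last path segment of the base, the base passed in needs to be the address of the page itself, not just the site root.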