[Tutor] extract hosts from html write to file
Kent Johnson
kent37 at tds.net
Wed Sep 12 00:29:43 CEST 2007
sacha rook wrote:
> Hi, I wonder if anyone can help with the following.
>
> I am trying to read an HTML page, extract only the fully qualified
> hostnames from the page, and output these hostnames to a file on disk
> to be used later as input to another program.
I would use BeautifulSoup to parse out the hrefs and urlparse.urlparse()
to split the hostname out of the href.
http://www.crummy.com/software/BeautifulSoup/documentation.html
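
A minimal sketch of that approach, under some assumptions: it is written for Python 3, where urlparse.urlparse() lives at urllib.parse.urlparse(), and it swaps in the standard library's html.parser.HTMLParser for BeautifulSoup so that it runs without third-party packages. The names HrefCollector, extract_hosts, write_hosts and SAMPLE_HTML are made up for illustration, and the sample page is invented.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_hosts(html):
    """Return the sorted, unique hostnames found in absolute hrefs."""
    collector = HrefCollector()
    collector.feed(html)
    hosts = set()
    for href in collector.hrefs:
        # hostname is None for relative links and for schemes like mailto:
        host = urlparse(href).hostname
        if host:
            hosts.add(host)
    return sorted(hosts)


def write_hosts(hosts, path):
    """Write one hostname per line so another program can read the file."""
    with open(path, "w") as out:
        for host in hosts:
            out.write(host + "\n")


# Invented sample page, only for demonstration.
SAMPLE_HTML = """
<html><body>
<a href="http://www.python.org/doc/">docs</a>
<a href="http://mail.python.org/mailman/listinfo/tutor">list</a>
<a href="/relative/path.html">relative link, no hostname</a>
<a href="http://www.python.org/download/">download</a>
</body></html>
"""

hosts = extract_hosts(SAMPLE_HTML)
```

Calling write_hosts(hosts, "hosts.txt") would then save the result, one hostname per line. Relative links are skipped because they carry no hostname. If "fully qualified" needs to be stricter than "whatever appears in the href", an extra check on the hostname could go inside the loop in extract_hosts().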
Kent