how to scrape url out of href

Mike Meyer mwm at
Sun Jan 1 19:59:10 EST 2006

homepricemaps at writes:
> i need to scrape a url out of an href.  it seems that people recommend
> that i use beautiful soup but had some problems.

What problem are you having with BeautifulSoup? It's working fine for me.

> does anyone have sample code for scraping the actual url out of an href
> like this one
> <a href="" target="_blank">

The following fragment works fine for me:

        linktext = soup.fetchText('Next')
        if not linktext:
            return pages
        url = linktext[0].findParent('a')['href']

So you probably want something like:

   for anchor in soup.fetch('a', {'target': '_blank'}):
       print anchor['href']
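
If BeautifulSoup itself is what's giving you trouble, here is a rough sketch of the same idea using only the standard library's html.parser module (Python 3) instead. This is a swapped-in approach, not the BeautifulSoup API. The class name BlankTargetLinks, the helper scrape_hrefs, and the example.com URLs are all made up for illustration.

```python
from html.parser import HTMLParser


class BlankTargetLinks(HTMLParser):
    """Collect href values from <a> tags whose target is _blank."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs with
        # lowercased names; turn it into a dict for easy lookup.
        if tag != 'a':
            return
        attrs = dict(attrs)
        if attrs.get('target') == '_blank' and attrs.get('href'):
            self.urls.append(attrs['href'])


def scrape_hrefs(html):
    """Return the hrefs of all target="_blank" anchors in html."""
    parser = BlankTargetLinks()
    parser.feed(html)
    parser.close()
    return parser.urls


# Hypothetical sample markup; only the first anchor has target="_blank".
sample = ('<a href="http://example.com/listing" target="_blank">Next</a>'
          '<a href="http://example.com/other">Other</a>')
for url in scrape_hrefs(sample):
    print(url)
```

Anchors without an href, or with an empty one, are skipped, so you only get usable URLs back.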


Mike Meyer <mwm at>
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
