BeautifulSoup

yamamoto blueskykind02 at gmail.com
Wed Jan 13 07:46:00 EST 2010


Hi,
I am new to Python. I'd like to extract the "a" tags from a website
using the BeautifulSoup module, but it doesn't work!

# sample.py

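from BeautifulSoup import BeautifulSoup as bs
import urllib

url = "http://www.d-addicts.com/forum/torrents.php"
doc = urllib.urlopen(url).read()  # fetch the raw HTML of the page
soup = bs(doc)                    # parse it (this is the call that fails)
result = soup.findAll("a")        # collect every <a> tag
for i in result:
    print i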


Traceback (most recent call last):
  File "C:\Users\falcon\workspace\p\pyqt\ex1.py", line 8, in <module>
    soup=bs(doc)
  File "C:\Python26\lib\site-packages\BeautifulSoup.py", line 1499, in __init__
    BeautifulStoneSoup.__init__(self, *args, **kwargs)
  File "C:\Python26\lib\site-packages\BeautifulSoup.py", line 1230, in __init__
    self._feed(isHTML=isHTML)
  File "C:\Python26\lib\site-packages\BeautifulSoup.py", line 1263, in _feed
    self.builder.feed(markup)
  File "C:\Python26\lib\HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "C:\Python26\lib\HTMLParser.py", line 148, in goahead
    k = self.parse_starttag(i)
  File "C:\Python26\lib\HTMLParser.py", line 226, in parse_starttag
    endpos = self.check_for_whole_start_tag(i)
  File "C:\Python26\lib\HTMLParser.py", line 301, in check_for_whole_start_tag
    self.error("malformed start tag")
  File "C:\Python26\lib\HTMLParser.py", line 115, in error
    raise HTMLParseError(message, self.getpos())
HTMLParser.HTMLParseError: malformed start tag, at line 276, column 36
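
The traceback shows the strict HTMLParser rejecting the page's markup before
findAll is ever reached. One possible stopgap is to pull the links out with a
regular expression instead of parsing the whole page. This is only a rough
sketch under that assumption: find_anchor_tags and A_TAG are made-up names,
not part of BeautifulSoup, and a regex will miss nested or unusually written
tags.

```python
import re

# Hypothetical pattern (not part of BeautifulSoup): matches <a ...>...</a>
# pairs without parsing the rest of the page. \b keeps it from matching
# other tags that merely start with "a", such as <abbr>.
A_TAG = re.compile(r"<a\b[^>]*>.*?</a>", re.IGNORECASE | re.DOTALL)


def find_anchor_tags(markup):
    """Return the raw text of every <a>...</a> element found in markup."""
    return A_TAG.findall(markup)
```

In the script above it could be called as find_anchor_tags(doc) in place of
the bs(doc) and soup.findAll("a") calls.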

Any suggestions?
Thanks in advance.



