URL listers

P. Daniell pdaniell at ign.com
Mon Nov 17 08:21:19 CET 2003

I have the following HTML document

<a href="http://www.yahoo.com">I don't give a hoot</a>

I want my HTMLParser subclass (code below) to output

http://www.yahoo.com I don't give a hoot

Instead it outputs 

http://www.yahoo.com I don
http://www.yahoo.com  '
http://www.yahoo.com t give a hoot

Would anyone care to give me some guidance on how to fix this?


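import formatter
from htmllib import HTMLParser


class URLLister(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self, formatter.NullFormatter())
        self.in_a = 0       # 1 while we are inside an <a> element
        self.tempurl = ''   # href of the <a> currently open

    def anchor_bgn(self, href, name, type):
        # Called by htmllib when an <a> start tag is seen
        self.in_a = 1
        self.tempurl = href

    def anchor_end(self):
        # Called by htmllib when the matching </a> is seen
        self.in_a = 0

    def handle_data(self, data):
        # This prints once per call, which is where the three lines come from
        if self.in_a == 1:
            print self.tempurl, data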
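
One direction that might work, as a rough sketch only: handle_data can be called several times for a single run of text, so instead of printing there, buffer the pieces and emit them once when the anchor closes. The sketch below uses Python 3's html.parser.HTMLParser rather than the htmllib class above (that substitution is an assumption on my part), so the anchor hooks become handle_starttag and handle_endtag. The chunks and links attributes are names made up for illustration.

```python
from html.parser import HTMLParser


class URLLister(HTMLParser):
    """Collect (href, link text) pairs, one per <a> element."""

    def __init__(self):
        super().__init__()
        self.in_a = False
        self.tempurl = ''
        self.chunks = []   # hypothetical buffer: text pieces inside the current <a>
        self.links = []    # hypothetical result list: finished (href, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self.in_a = True
            # attrs is a list of (name, value) pairs; value may be None
            self.tempurl = dict(attrs).get('href') or ''
            self.chunks = []

    def handle_endtag(self, tag):
        if tag == 'a' and self.in_a:
            self.in_a = False
            # Join the buffered pieces so the link text comes out as one string
            self.links.append((self.tempurl, ''.join(self.chunks)))

    def handle_data(self, data):
        # May be called more than once per run of text, so only buffer here
        if self.in_a:
            self.chunks.append(data)


parser = URLLister()
parser.feed('<a href="http://www.yahoo.com">I don\'t give a hoot</a>')
parser.close()
for url, text in parser.links:
    print(url, text)
```

The same idea should carry over to the htmllib version: append to a list in handle_data and do the print in anchor_end.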
