HTML Parser - beginner needs help
Bjorn Pettersen
bjorn at roguewave.com
Thu Sep 14 21:05:37 EDT 2000
In this case, the HTMLParser class in htmllib has a handle_image method
that does exactly what you want (see below). As Frederik points out, though,
it is generally easier to use sgmllib for extracting tags...
-- bjorn
import htmllib, formatter, urllib

class IMGParser(htmllib.HTMLParser):
    def __init__(self):
        htmllib.HTMLParser.__init__(self, formatter.NullFormatter())
        self.images = []

    def handle_image(self, src, alt, *args):
        self.images.append(src)

parser = IMGParser()
parser.feed(urllib.urlopen("http://www.python.org").read())
parser.close()
print parser.images
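
For anyone reading this on a current Python: htmllib and sgmllib were removed in Python 3, so the code above will not run there as written. A rough equivalent of the same idea, catching each img tag and keeping its src, might look like the sketch below. It uses html.parser from the standard library; the ImgCollector name and the sample markup are made up for illustration, and a literal string stands in for the fetched page.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collect the src attribute of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; attrs is a list of
        # (name, value) pairs.
        if tag == "img":
            src = dict(attrs).get("src")
            if src is not None:
                self.images.append(src)


# Hypothetical sample markup, used here instead of fetching a live page.
sample = '<p><a href="/">home</a><img src="logo.gif" alt="logo"><IMG SRC="b.png"></p>'

collector = ImgCollector()
collector.feed(sample)
collector.close()
print(collector.images)
```

The handler is keyed on the start tag because img has no closing tag, which is also why an end_img method never sees anything.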
zet wrote:
>
> Can somebody provide a small piece of code that returns a list of img tags?
> I've been trying these lines:
>
> class IMGParser(HTMLParser):
>     def end_img(arg):
>         return
>
> but it returns only anchors; how do I get the IMGs?