HTML Parser - beginner needs help

Fredrik Lundh effbot at
Thu Sep 14 21:40:29 CEST 2000

"zet" wrote:
> Can somebody provide small piece of code, which returns list of  img tags?
> I've trying this lines:
> class IMGParser(HTMLParser):
>  def end_img(arg):
>   return

if you're looking for tags, sgmllib is usually easier to use.
here's an example:

# extract image tags
# (based on an example from the eff-bot guide)

import sgmllib

class ImageParser(sgmllib.SGMLParser):

    def __init__(self, verbose=0):
        sgmllib.SGMLParser.__init__(self, verbose)
        self.images = []

    def do_img(self, attrs):
        # attrs is a list of (name, value) pairs for the img tag
        for k, v in attrs:
            if k == "src":
                self.images.append(v)

def extract(file):
    # get img tags from an HTML/SGML stream
    p = ImageParser()
    while 1:
        s = file.read(512)  # read in chunks; the chunk size is arbitrary
        if not s:
            break
        p.feed(s)
    p.close()
    return p.images

# try it out

import urllib

print extract(urllib.urlopen(""))

## prints:
## ['./pics/PyBanner011.gif',
##  './pics/PythonPoweredSmall.gif',
##  'pics/pythonHi.gif']
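
note for readers on python 3: sgmllib is not in the standard library there, so the example above only runs on python 2. here's a rough sketch of the same idea, assuming html.parser.HTMLParser as the replacement. the helper name extract_from_string and the sample markup are made up for illustration; they're not from the example above.

```python
# python 3 sketch: collect the src attribute of every img tag
from html.parser import HTMLParser


class ImageParser(HTMLParser):

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # html.parser has no per-tag do_img hook, so check the tag name here;
        # attrs is a list of (name, value) pairs, names already lowercased
        if tag == "img":
            for k, v in attrs:
                if k == "src" and v is not None:
                    self.images.append(v)


def extract_from_string(html):
    # hypothetical helper: get img sources from an HTML string
    p = ImageParser()
    p.feed(html)
    p.close()
    return p.images


# try it out (made-up markup)
sample = '<p><img src="pics/a.gif"><img alt="x" src="./pics/b.gif" /></p>'
print(extract_from_string(sample))
```

self-closing tags such as `<img ... />` go through handle_startendtag, which calls handle_starttag by default, so they're picked up as well.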


<!-- (the eff-bot guide to) the standard python library:
