Newby: How do I strip HTML tags?

Gerson Kurz gerson.kurz at t-online.de
Fri Jun 7 12:40:27 EDT 2002


This is a quite straightforward function:

def StripTags(text):
    finished = 0
    while not finished:
        finished = 1
        # check if there is an open tag left
        start = text.find("<")
        if start >= 0:
            # if there is, check if the tag gets closed
            stop = text[start:].find(">")
            if stop >= 0:
                # if it does, strip it, and continue loop
                text = text[:start] + text[start+stop+1:]
                finished = 0
    return text
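
For comparison, the same "cut from each '<' to the next '>'" idea can be sketched with the standard re module. This is only a sketch: the name strip_tags_re is made up for illustration, and like the loop above it knows nothing about HTML, so a stray "<" followed later by a ">" in ordinary text gets eaten as well.

```python
import re

# Matches a '<', then anything that is not '>', then the closing '>'.
_TAG_RE = re.compile(r"<[^>]*>")

def strip_tags_re(text):
    # Hypothetical helper: remove every '<...>' span; an unclosed '<' is left alone.
    return _TAG_RE.sub("", text)

print(strip_tags_re("<b>Hello</b>, <i>world</i>!"))
```

An unclosed "<" (as in "a < b") is left in place, which matches what the loop version does.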

Or, if you feel lucky, you could use this more complicated solution:

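def StripTags(text):
    # flag[0] is 1 while outside a tag and 0 while inside one; a one-element
    # list is used so the nested function can modify it
    flag = [1]
    def stripfunc(c):
        if not flag[0]:
            # inside a tag: drop the character, and leave the tag on '>'
            if c == '>':
                flag[0] = 1
                return 0
        elif c == '<':
            # a tag starts here
            flag[0] = 0
        return flag[0]
    # join the kept characters; on Python 3, filter() returns an iterator
    return "".join(filter(stripfunc, text))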
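
If the input is real-world HTML rather than simple markup, a parser-based variant may be safer than scanning for brackets by hand. The following is a minimal sketch that assumes Python 3 and its standard html.parser module; the names _TextCollector and strip_tags_parser are invented for this example and are not part of either function above.

```python
from html.parser import HTMLParser

class _TextCollector(HTMLParser):
    # Hypothetical helper: keeps only the text found between tags.
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; tags and comments never reach here.
        self.parts.append(data)

def strip_tags_parser(text):
    collector = _TextCollector()
    collector.feed(text)
    collector.close()  # flush any text still buffered by the parser
    return "".join(collector.parts)

print(strip_tags_parser("<p>Hello <b>world</b></p>"))
```

Unlike the two functions above, this also drops comments and decodes character references such as &amp;, which may or may not be what you want.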




