HTML Parser

Greg Jorgensen gregj at pobox.com
Sun Dec 31 00:02:22 EST 2000


"Greg Jorgensen" <gregj at pobox.com> wrote:

> rx = re.compile('(<.*?>)', re.MULTILINE)

Oops -- that should be:

    rx = re.compile('(<.*?>)', re.DOTALL)

re.DOTALL makes the . match newlines as well; re.MULTILINE only changes where
^ and $ match. You need DOTALL because HTML tags can span lines.
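
Here is a quick sketch of the difference between the two flags. The sample
string and the variable names are made up for illustration; only the pattern
and the flags come from the messages above.

```python
import re

# Hypothetical sample: a tag whose attributes wrap onto a second line.
html = '<a href="index.html"\n   class="nav">Home</a>'

# With re.MULTILINE the . still stops at a newline, so the opening tag
# that spans two lines is never matched.
multiline_rx = re.compile('(<.*?>)', re.MULTILINE)
multiline_tags = multiline_rx.findall(html)

# With re.DOTALL the . also matches the newline, so the whole tag is found.
dotall_rx = re.compile('(<.*?>)', re.DOTALL)
dotall_tags = dotall_rx.findall(html)

print(multiline_tags)
print(dotall_tags)
```

On this sample the MULTILINE version should find only the closing tag, while
the DOTALL version finds both tags.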

--
Greg Jorgensen
PDXperts
Portland, Oregon, USA
gregj at pobox.com




