>I am trying to write an HTML parser. I am starting off with a simple
>one like so:
>input_file =
>jsp_content = newline.split(input_file)

Two things, neither of which answers your question (others have already
done that...):

First, you don't need re to split a file into lines. Given an open file
object, you could've just said:

jsp_content = file.readlines()

(Note that this, like your existing code, reads the entire file into
memory, which might not be a good idea if your file is huge. Also,
readlines() keeps the newline at the end of each line.)
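
If memory is a concern, you can loop over the file object instead of
calling readlines(). Here's a rough sketch of both; the io.StringIO
object and its contents are made up and only stand in for your real
opened file:

```python
import io

# Hypothetical stand-in for the opened file; the content is made up.
sample = io.StringIO("<html>\n<body>\n</body>\n</html>\n")

# readlines() returns every line at once, each with its trailing newline.
jsp_content = sample.readlines()

# Iterating over the file object yields one line at a time instead,
# so the whole file never has to sit in memory at once.
sample.seek(0)
line_count = 0
for line in sample:
    line_count += 1
```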

Second (this isn't Python-related), you probably don't want to split
your file into lines in any case. HTML is *not* a line-based language.
The following is a perfectly valid HTML tag:


Your code wouldn't work with such tags, since it works line-by-line.
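
To illustrate, here's a small sketch assuming Python 3's standard
html.parser module (older Pythons ship a similar HTMLParser module).
The TagCollector class and the sample tag are made up for the example.
It shows a start tag spread over several lines being handled as one
tag, while a line split cuts it into pieces:

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper that records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Called once per start tag, however many lines the tag spans.
        self.tags.append((tag, attrs))


# Made-up example of a single tag written across several lines.
document = '<a\n   href="index.jsp"\n   class="nav"\n>home</a>'

collector = TagCollector()
collector.feed(document)
collector.close()

# Splitting on newlines breaks the same tag into four fragments.
fragments = document.split("\n")
```

The parser reports one "a" tag with both attributes, whereas none of
the fragments is a complete tag on its own.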

