[Tutor] ASP parsing

Wed Feb 19 02:54:23 2003

On 2003.02.19 06:36 Danny Yoo wrote:
> ... >
> I see that the 'source' file is full of lines that may or may not
> contain the '<% [asp code here...] %>' ASP chunks.  For simplicity, you
> may want to try the 're' regular expression module; it's especially
> useful for breaking text into chunks for code.
> 
> For example:
> 
> ###
> >>> import re
> >>> pattern = re.compile(r"""(<%          ### The leading ASP tag start
> ...                           .*?         ### followed by a bunch of
> ...                                       ### characters, matching minimally,
> ...                           %>          ### with the ASP terminator.
> ...                          )""", re.DOTALL | re.VERBOSE)
> >>> pattern.split("""
> ... This is a test <% yes it is %>
> ... <% oh
> ...    this is
> ...    another test %>
> ... ok, I'm done""")
> ['\nThis is a test ',
>  '<% yes it is %>',
>  '\n',
>  '<% oh\n   this is\n   another test %>',
>  "\nok, I'm done"]
> ###
> 
> And now it should be much easier to process each list element, because
> all our parsing is done: each list element is either normal, or an ASP
> chunk.

The re module was my first approach to the problem, but I chose the
other one because, as you can see, I rewrite the content of the file
into a temporary one with all the modifications I've made. It seems to
me that parsing one line at a time is much simpler than parsing the
whole text and then doing it again for the quoted part, because I've
got *a lot* of ASP lines, often with a really unclear structure (e.g.
<HTML><%ASP%>html<%ASP%>html<%ASP%></HTML>).
Am I missing something? Do you think that using file.readlines(),
separating the ASP from the HTML, and then parsing the ASP part and the
HTML part is better than implementing a line-by-line parser?
I'll follow your suggestion for the GUI code of course, thank you very
much.
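
To make the comparison concrete, here is a minimal sketch of the
split-based approach applied to a mixed line like the one above. The
names split_asp and rewrite, and the two handler arguments, are
hypothetical and only for illustration; they are not taken from my
script. The same function works on a single line or on the whole file
content, so it might be usable from a line-by-line loop as well.

```python
import re

# Same idea as the quoted pattern: capture each <% ... %> chunk,
# matching minimally and letting '.' cross newlines.
ASP_CHUNK = re.compile(r"(<%.*?%>)", re.DOTALL)

def split_asp(text):
    """Return (kind, chunk) pairs, where kind is 'asp' or 'html'."""
    parts = []
    # With one capturing group, split() alternates between the text
    # outside the matches (even indexes) and the matches (odd indexes).
    for index, chunk in enumerate(ASP_CHUNK.split(text)):
        if not chunk:
            continue  # empty strings appear around adjacent matches
        kind = "asp" if index % 2 == 1 else "html"
        parts.append((kind, chunk))
    return parts

def rewrite(text, asp_handler, html_handler):
    """Apply one (hypothetical) handler per chunk kind and rejoin."""
    result = []
    for kind, chunk in split_asp(text):
        handler = asp_handler if kind == "asp" else html_handler
        result.append(handler(chunk))
    return "".join(result)

if __name__ == "__main__":
    line = "<HTML><%ASP%>html<%ASP%>html<%ASP%></HTML>"
    for kind, chunk in split_asp(line):
        print(kind, repr(chunk))
```

One difference worth noting: applied to the whole text, the split also
handles an ASP chunk that spans several lines, which a purely
line-by-line parser would have to track by hand.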

	Nicholas