[Tutor] ASP parsing
Danny Yoo
dyoo@hkn.eecs.berkeley.edu
Wed Feb 19 00:37:01 2003
On Tue, 18 Feb 2003, Nicholas Wieland wrote:
> does someone know of a simple way of parsing a scripting
> language like ASP in Python? I'm having a lot of trouble because
> it's embedded in HTML text, so I need to parse a text file line by line
> switching from HTML to ASP if I encounter a '<%'.
Hi Nicholas,
I took a look at the code; I see that there's a bit of wxPython graphical
interface code intermingled with the ASP parsing. You might want to
separate the two from each other, so that it's easier to test the parsing
code without having to invoke the GUI.
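To make that separation concrete, here's a rough sketch: the parsing lives in a plain function that takes a string and returns a list, with no wxPython imports, so it can be checked straight from the interactive prompt. The name find_asp_blocks is made up for illustration and isn't taken from your code, and it assumes ASP blocks never nest.

```python
def find_asp_blocks(text):
    """Return the '<% ... %>' blocks found in text, in order."""
    blocks = []
    start = text.find("<%")
    while start != -1:
        end = text.find("%>", start + 2)
        if end == -1:
            break  # unterminated block: stop rather than guess
        blocks.append(text[start:end + 2])
        start = text.find("<%", end + 2)
    return blocks

# A quick check that needs no GUI at all:
assert find_asp_blocks("<p><% x = 1 %></p><% y %>") == ["<% x = 1 %>", "<% y %>"]
```

The GUI code can then call a function like this and only worry about displaying the result.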
I see that the 'source' file is full of lines that may or may not contain
the '<% [asp code here...] %>' ASP chunks. For simplicity, you may want
to try the 're' regular expression module; it's especially useful for
breaking text into chunks.
For example:
###
>>> import re
>>> pattern = re.compile(r"""(<%    ### The leading ASP tag start
...                           .*?   ### followed by a bunch of
...                                 ### characters, matching minimally,
...                           %>    ### with the ASP terminator.
...                          )""", re.DOTALL | re.VERBOSE)
>>> pattern.split("""
... This is a test <% yes it is %>
... <% oh
...  this is
...  another test %>
... ok, I'm done""")
['\nThis is a test ',
'<% yes it is %>',
'\n',
'<% oh\n this is\n another test %>',
"\nok, I'm done"]
###
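And now it should be much easier to process each list element, because the
chunking is done: each list element is either plain HTML text or an ASP
chunk. (The parentheses in the pattern are what make split() keep the ASP
chunks in the list instead of throwing them away.)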
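As a rough sketch of that next step, here's one way to label each chunk. The function name classify_chunks and the 'asp'/'html' labels are my own inventions for illustration, and the pattern is the same one as above written on a single line. It's written for a current Python, where print is a function.

```python
import re

# Same pattern as above; the capturing group keeps the ASP chunks
# in the list that split() returns.
ASP_CHUNK = re.compile(r"(<%.*?%>)", re.DOTALL)

def classify_chunks(source):
    """Return a list of (kind, text) pairs, where kind is 'asp' or 'html'."""
    result = []
    for chunk in ASP_CHUNK.split(source):
        if not chunk:
            continue  # split() can yield empty strings next to a match
        if chunk.startswith("<%") and chunk.endswith("%>"):
            # Drop the delimiters and surrounding whitespace from the ASP code.
            result.append(("asp", chunk[2:-2].strip()))
        else:
            result.append(("html", chunk))
    return result

for kind, text in classify_chunks("This is a test <% yes it is %>\nok, I'm done"):
    print(kind, repr(text))
```

From there, the 'html' pieces can be passed through untouched, and only the 'asp' pieces need any further parsing.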
If you're curious to learn more about regular expressions, you may find
A.M. Kuchling's tutorial on regular expressions useful:
http://www.amk.ca/python/howto/regex/
This process of "chunking" our data is also known as "lexing". For lexing
a piece of text, regular expressions can be powerful and versatile tools.
Hope this helps!