[Baypiggies] Mini languages
Dennis Reinhardt
DennisR at dair.com
Sun May 7 21:15:36 CEST 2006
At 11:38 AM 5/7/2006, Ken Seehart wrote:
>This means I have to know when I reach the closing brace (which I can't do
>with regular expressions). However, I'm sure I could do a prototype this
>way, using the assumption that a closing brace on a class matches
>"^};", but that would be just plain sloppy :-)
I don't know your syntax, but it sounds like you (1) know when to expect
braces. I am further guessing that you have (2) only a single level of
braces. The routine below works under these two assumptions. Before I
worked with Python, I implemented a self-compiler whose syntax met them.
If the assumptions could not be met, I would be inclined to use an LALR
parser, but I have no direct experience with one. Rather, I designed the
syntax so that it did not require LALR. I have had some success parsing
in Python using regular expressions with the following code:
import re

# Separate text (e.g. HTML) into 5 components based on two regexes.
# Matching is case-insensitive and DOTALL.
# Input regexes should use (?:...) grouping, if any.
# Pass the regexes as strings, not compiled patterns; they are compiled here.
def regex_sep(text, regex1, regex2):
    left = lm = mid = rm = right = ""   # lm, rm: text matched by each regex
    flags = re.IGNORECASE | re.DOTALL
    re1 = re.compile("(%s)" % regex1, flags)
    match1 = re1.search(text)
    if match1:
        lm = match1.group(1)
        left, rest = split2(text, lm)
        re2 = re.compile("(%s)" % regex2, flags)
        match2 = re2.search(rest)
        if match2:
            rm = match2.group(1)
            mid, right = split2(rest, rm)
        else:
            mid = rest          # no closing match: the remainder is the middle
    else:
        left = text             # no opening match: everything stays on the left
    return left, lm, mid, rm, right

# Split text at the first literal occurrence of pattern.
# If pattern is empty or not found, return (text, "").
def split2(text, pattern):
    if not pattern:
        return (text, "")
    left, _, right = text.partition(pattern)
    return (left, right)
The call

    x1, x2, x3, x4, x5 = regex_sep(input_str, "{", "}")

separates input_str into 5 components:

    x1 = text before the first regex match
         (all of input_str, with the others "", if the first regex
         does not match)
    x2 = text that matched the first regex (trivially "{" here)
    x3 = text between the two matches
         (everything after the first match if the second regex
         does not match)
    x4 = text that matched the second regex (trivially "}" here)
    x5 = text following the second regex match
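
One possible variation, offered only as a sketch and not as the routine
above: the same five-way split can be taken from the match positions
rather than by splitting again on the matched text. That seems to matter
mainly for anchored patterns such as the "^};" mentioned in the quoted
message, where the same characters may also occur earlier in the middle of
a line. The name regex_sep_pos and the sample input are made up for
illustration, and the example still assumes a single level of class braces.

```python
import re

def regex_sep_pos(text, regex1, regex2):
    """Hypothetical variant of regex_sep that slices on match positions."""
    flags = re.IGNORECASE | re.DOTALL
    left, lm, mid, rm, right = text, "", "", "", ""
    match1 = re.compile(regex1, flags).search(text)
    if match1:
        left = text[:match1.start()]
        lm = match1.group(0)
        mid = text[match1.end():]   # used as-is if the second regex fails
        # Search the original string from the end of the first match, so
        # that "^" still refers to real line starts in the full text.
        match2 = re.compile(regex2, flags).search(text, match1.end())
        if match2:
            mid = text[match1.end():match2.start()]
            rm = match2.group(0)
            right = text[match2.end():]
    return left, lm, mid, rm, right

# Made-up sample: the body contains "};" in mid-line before the real close.
sample = "class Foo {\n  int a[] = {1};\n};\ntail"
parts = regex_sep_pos(sample, r"\{", r"(?m)^\};")
print(parts)
# ('class Foo ', '{', '\n  int a[] = {1};\n', '};', '\ntail')
```

Here the closing pattern only matches "};" at the start of a line, so the
"{1};" inside the body stays in the middle component.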
Regards, Dennis
----------------------------------
| Dennis | DennisR at dair.com |
| Reinhardt | Powerful Anti-Spam |
----------------------------------