TPG error when using 't' as the first letter of a token
Paul McGuire
ptmcg at austin.rr._bogus_.com
Thu Nov 18 17:19:52 EST 2004
"Andrew James" <drew at gremlinhosting.com> wrote in message
news:mailman.6536.1100768465.5135.python-list at python.org...
> Gentlemen,
>
> I'm running into a problem whilst testing the parsing of a language I've
> created with TPG. It seems that for some reason, TPG balks when I try
> to parse an expression whose first letter is 't' (or, in fact, at any
> time when 't' is at the beginning of a token). This doesn't happen with
> any other letter (as far as I know), nor if the 'T' is capitalised.
>
> My grammar looks like this:
>
> # Tokens
> separator space '\s+';
> token Num '\d+(.\d+)?';
> token Ident '[a-zA-Z]\w*';
> token CharList '\'.*\'';
> token CatUnOp '~';
> token CatOp '[/\^~]';
> token MetaOp '[=\+\-!]';
> token Date '\d\d-\d\d-\d\d\d\d';
> token FileID '(\w+\.\w+)'
> ;
> # Rules
> START -> CatExpr '\?' '[' MetaExpr ']'
> | CatExpr
> | FileID
> ;
> CatExpr -> CatUnOp CatName
> | CatName (CatOp CatName)*
> | CatName
> ;
> CatName -> Ident
> #| '(' CatExpr ')'
> ;
> MetaExpr -> MetaCrit (',' MetaCrit)*
> ;
> MetaCrit -> Ident MetaOp Value
> ;
> Value -> CharList | Num | Date
> ;
>
> My test script looks like this:
>
> if __name__ == '__main__':
>     """ For testing purposes only """
>     parseTests = ('This/is/a/simple/test', 'another/simple/test',
>                   "a/test/with/[author='drew']")
>     for line in parseTests:
>         try:
>             print "\nParsing: %s \n%s\n" % (line,"="*(len(line)+9))
>             qp = MFQueryParser()
>             print qp(line)
>         except Exception, inst:
>             print "EXCEPTION: " + str(inst)
>
> <snip>
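FYI, as a comparative data point, here is your parser implemented using
pyparsing. I had to change your last test case (a '?' instead of a '/'
before the bracketed part) because it didn't seem to match your grammar.
I also changed the second test case to start with a 't'.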
-- Paul
(Download pyparsing at http://pyparsing.sourceforge.net.)
from pyparsing import alphas, nums, alphanums, Word, Optional, oneOf, Group, \
    Literal, Combine, sglQuotedString, Forward, delimitedList, ZeroOrMore, \
    OneOrMore
integer = Word(nums)
num = Combine(integer + Optional("." + integer))
identChars = alphanums + "_$"
ident = Word(alphas, identChars)
charList = sglQuotedString
unop = Literal("~")
binop = oneOf("/ ^ ~")
metaOp = oneOf("= + - !")
date = Combine( Word(nums,exact=2) + "-" + Word(nums,exact=2) + "-" +
Word(nums,exact=4) )
fileId = Combine( Word(identChars) + "." + Word(identChars) )
value = charList | date | num
metaCrit = ident + metaOp + value
metaExpr = Group(delimitedList( metaCrit ))
expr = Forward()
name = ident | Group( "(" + expr + ")" )
expr << Group( ( unop + name ) | ( name + ZeroOrMore(binop + name) ) )
start = Group( expr + "?" + "[" + metaExpr + "]" ) | expr | fileId
parseTests = (
'This/is/a/simple/test',
'tanother/simple/test',
"a/test/with?[author='drew']"
)
for t in parseTests:
    print(start.parseString(t))
Output:
=====
[['This', '/', 'is', '/', 'a', '/', 'simple', '/', 'test']]
[['tanother', '/', 'simple', '/', 'test']]
[[['a', '/', 'test', '/', 'with'], '?', '[', ['author', '=', "'drew'"], ']']]
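
On the changed test case: a rough way to see why the original string does
not fit the grammar is to approximate the START rule with a regular
expression. This is only a sketch using Python's re module, not TPG or
pyparsing. The names IDENT, CAT_EXPR, VALUE, META_CRIT, META_EXPR, START and
matches_grammar are made up for illustration, the FileID alternative is left
out, and the CharList pattern is tightened to '[^']*' as an assumption.

```python
import re

# Regex stand-ins for the quoted TPG tokens (an approximation, not TPG itself).
IDENT = r"[a-zA-Z]\w*"
# CatExpr -> CatUnOp CatName | CatName (CatOp CatName)*
CAT_EXPR = r"(?:~%s|%s(?:[/^~]%s)*)" % (IDENT, IDENT, IDENT)
# Value -> CharList | Date | Num  (CharList narrowed to '[^']*' for this sketch)
VALUE = r"(?:'[^']*'|\d\d-\d\d-\d\d\d\d|\d+(?:\.\d+)?)"
# MetaCrit -> Ident MetaOp Value
META_CRIT = r"%s[=+\-!]%s" % (IDENT, VALUE)
# MetaExpr -> MetaCrit (',' MetaCrit)*
META_EXPR = r"%s(?:,%s)*" % (META_CRIT, META_CRIT)
# START -> CatExpr '?' '[' MetaExpr ']' | CatExpr   (FileID omitted here)
START = re.compile(r"^%s(?:\?\[%s\])?$" % (CAT_EXPR, META_EXPR))


def matches_grammar(text):
    """Return True if the whole string fits the approximated START rule."""
    return START.match(text) is not None


for case in ("a/test/with/[author='drew']", "a/test/with?[author='drew']"):
    print("%-32s %s" % (case, matches_grammar(case)))
```

Under these assumptions the first string is rejected and the second is
accepted: in the original, the final '/' is followed by '[' rather than an
identifier, and there is no '?' to introduce the bracketed criteria.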