TPG error when using 't' as the first letter of a token

Thu Nov 18 17:19:52 EST 2004

"Andrew James" <drew at gremlinhosting.com> wrote in message
news:mailman.6536.1100768465.5135.python-list at python.org...
> Gentlemen,
>
> I'm running into a problem whilst testing the parsing of a language I've
> created with TPG. It seems that for some reason, TPG balks when I try
> to parse an expression whose first letter is 't' (or, in fact, at any
> time when 't' is at the beginning of a token). This doesn't happen with
> any other letter (as far as I know), nor if the 'T' is capitalised.
>
> My grammar looks like this:
>
>     # Tokens
>     separator space '\s+';
>     token Num '\d+(.\d+)?';
>     token Ident '[a-zA-Z]\w*';
>     token CharList '\'.*\'';
>     token CatUnOp '~';
>     token CatOp '[/\^~]';
>     token MetaOp '[=\+\-!]';
>     token Date '\d\d-\d\d-\d\d\d\d';
>     token FileID '(\w+\.\w+)'
>     ;
>     # Rules
>     START -> CatExpr '\?' '[' MetaExpr ']'
>     | CatExpr
>     | FileID
>     ;
>     CatExpr -> CatUnOp CatName
>     | CatName (CatOp CatName)*
>     | CatName
>     ;
>     CatName -> Ident
>     #| '(' CatExpr ')'
>     ;
>     MetaExpr -> MetaCrit (',' MetaCrit)*
>     ;
>     MetaCrit -> Ident MetaOp Value
>     ;
>     Value -> CharList | Num | Date
>     ;
>
> My test script looks like this:
>
> if __name__ == '__main__':
>     """ For testing purposes only """
>     parseTests = ('This/is/a/simple/test', 'another/simple/test',
>                   "a/test/with/[author='drew']")
>     for line in parseTests:
>         try:
>             print "\nParsing: %s \n%s\n" % (line,"="*(len(line)+9))
>             qp = MFQueryParser()
>             print qp(line)
>         except Exception, inst:
>             print "EXCEPTION: " + str(inst)
>
> <snip>
FYI, as a comparative data point, here is your parser implemented using
pyparsing.  I had to change your last test case because it didn't seem to
match your grammar.

-- Paul
(Download pyparsing at http://pyparsing.sourceforge.net.)

from pyparsing import alphas, nums, alphanums, Word, Optional, oneOf, Group, \
      Literal, Combine, sglQuotedString, Forward, delimitedList, ZeroOrMore, \
      OneOrMore

integer = Word(nums)
num = Combine(integer + Optional("." + integer))
identChars = alphanums + "_$"
ident = Word(alphas, identChars)
charList = sglQuotedString
unop = Literal("~")
binop = oneOf("/ ^ ~")
metaOp = oneOf("= + - !")
date = Combine( Word(nums,exact=2) + "-" + Word(nums,exact=2) + "-" +
                Word(nums,exact=4) )
fileId = Combine( Word(identChars) + "." + Word(identChars) )

value = charList | date | num
metaCrit = ident + metaOp + value
metaExpr = Group(delimitedList( metaCrit ))
expr = Forward()
name = ident | Group( "(" + expr + ")" )
expr << Group( ( unop + name ) | ( name + ZeroOrMore(binop + name) ) )

start = Group( expr + "?" + "[" + metaExpr + "]" ) | expr | fileId

parseTests = (
    'This/is/a/simple/test',
    'tanother/simple/test',
    "a/test/with?[author='drew']"
    )
for t in parseTests:
    # Parenthesised form prints the same under Python 2 and Python 3.
    print(start.parseString(t))

Output:
=====
[['This', '/', 'is', '/', 'a', '/', 'simple', '/', 'test']]
[['tanother', '/', 'simple', '/', 'test']]
[[['a', '/', 'test', '/', 'with'], '?', '[', ['author', '=', "'drew'"], ']']]
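
On the changed test case: the original string has "/[" where the START rule
wants '?' followed by '['. The sketch below is one rough way to see that. It is
neither TPG nor pyparsing, just a regular-expression approximation of the
quoted grammar built with Python's standard re module. The names (IDENT,
CAT_EXPR, VALUE, META_CRIT, META_EXPR, FILE_ID, START, matches_grammar) are
made up for illustration, and whitespace handling via the separator token is
left out.

```python
import re

# Token patterns copied from the quoted grammar (whitespace separator omitted).
IDENT = r"[a-zA-Z]\w*"
FILE_ID = r"\w+\.\w+"
# Value -> CharList | Num | Date  (Date listed before Num so a date is not cut short)
VALUE = r"(?:'.*'|\d\d-\d\d-\d{4}|\d+(?:\.\d+)?)"

# CatExpr -> CatUnOp CatName | CatName (CatOp CatName)*
CAT_EXPR = r"(?:~%s|%s(?:[/^~]%s)*)" % (IDENT, IDENT, IDENT)
# MetaCrit -> Ident MetaOp Value ; MetaExpr -> MetaCrit (',' MetaCrit)*
META_CRIT = r"%s[=+\-!]%s" % (IDENT, VALUE)
META_EXPR = r"%s(?:,%s)*" % (META_CRIT, META_CRIT)

# START -> CatExpr '?' '[' MetaExpr ']' | CatExpr | FileID
START = re.compile(
    r"(?:%s\?\[%s\]|%s|%s)\Z" % (CAT_EXPR, META_EXPR, CAT_EXPR, FILE_ID)
)

def matches_grammar(text):
    """Return True if the whole string fits the approximated START rule."""
    return START.match(text) is not None

if __name__ == "__main__":
    for candidate in ("a/test/with/[author='drew']",
                      "a/test/with?[author='drew']"):
        print("%-30s %s" % (candidate, matches_grammar(candidate)))
```

Under this approximation the original string is rejected and the version with
'?' is accepted, which agrees with the pyparsing output above.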