[pypy-issue] [issue834] pypy.rlib.parsing gives unhelpful parse errors

Winston Ewert tracker at bugs.pypy.org
Thu Aug 18 20:43:25 CEST 2011

New submission from Winston Ewert <winstonewert at gmail.com>:

Steps to Reproduce:

Run the attached script (bug_case.py) with a Python interpreter, making sure
its imports resolve.

Actual Output:

  File <unknown>, line 0
ParseError: expected EOF

Desired Output:

  File <unknown>, line 0
ParseError: expected "b"

The rules for this grammar:

program: pair* EOF;
pair: "b" | "a" "b";


The parser tries to match "abaa" against [pair][pair], but the second [pair]
doesn't match. The parser then decides to accept a single [pair] as a valid
[pair*] and throws away the error that prevented the second [pair] from being
matched. The EOF at the end of program then fails to match, producing the error
seen above.
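
To make the behaviour above concrete, here is a minimal, self-contained sketch
of a hand-written matcher for the same grammar. It does not use
pypy.rlib.parsing at all; the names ParseError, match_pair and match_program
are hypothetical and exist only for this illustration. It assumes the input is
the string "abaa" and reproduces the symptom: the error raised inside the
second pair is swallowed by the repetition, so only the EOF failure is seen.

```python
class ParseError(Exception):
    """Toy error type (hypothetical, not the pypy.rlib.parsing class)."""

    def __init__(self, pos, expected):
        super().__init__("expected %s" % expected)
        self.pos = pos
        self.expected = expected


def match_pair(text, pos):
    """pair: "b" | "a" "b"; return the new position or raise ParseError."""
    if text.startswith("b", pos):
        return pos + 1
    if text.startswith("a", pos):
        if text.startswith("b", pos + 1):
            return pos + 2
        # The "a" matched, so the failure is one token further in.
        raise ParseError(pos + 1, '"b"')
    raise ParseError(pos, '"a" or "b"')


def match_program(text):
    """program: pair* EOF; the error from a failed pair is discarded."""
    pos = 0
    while True:
        try:
            pos = match_pair(text, pos)
        except ParseError:
            break  # pair* simply stops here and forgets why
    if pos != len(text):
        raise ParseError(pos, "EOF")
    return pos


if __name__ == "__main__":
    try:
        match_program("abaa")
    except ParseError as exc:
        print("ParseError:", exc, "at position", exc.pos)
```

In this sketch the reported failure is at position 2 (expected EOF), although
the matcher had already got as far as position 3 before giving up on the
second pair.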


The simple solution is to also pass the error information along even if the node
parses correctly. Then the logic of always reporting the error that occurred
furthest into the list of tokens will take care of reporting the appropriate
error. I've done a quick hack job of this in my copy, and it works. However, a
much cleaner solution would remove the error information from the tuple and just
store it on the Table object.
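
The idea of passing the error along even on success can be sketched as follows.
This is again a toy, self-contained matcher for the grammar above and not the
actual pypy.rlib.parsing code or the patch itself; ParseError, furthest,
match_pair and match_program are hypothetical names. Each match returns a
(position, error) tuple, and the caller keeps whichever error occurred furthest
into the input.

```python
class ParseError(Exception):
    """Toy error type (hypothetical, not the pypy.rlib.parsing class)."""

    def __init__(self, pos, expected):
        super().__init__("expected %s" % expected)
        self.pos = pos
        self.expected = expected


def furthest(a, b):
    """Return whichever error is further into the input (a wins ties)."""
    if a is None:
        return b
    if b is None:
        return a
    return b if b.pos > a.pos else a


def match_pair(text, pos):
    """pair: "b" | "a" "b"; return (new_pos, error), new_pos is None on failure."""
    if text.startswith("b", pos):
        return pos + 1, None
    if text.startswith("a", pos):
        if text.startswith("b", pos + 1):
            return pos + 2, None
        return None, ParseError(pos + 1, '"b"')
    return None, ParseError(pos, '"a" or "b"')


def match_program(text):
    """program: pair* EOF; remembers the furthest error seen so far."""
    pos = 0
    best = None
    while True:
        new_pos, error = match_pair(text, pos)
        best = furthest(best, error)  # keep the error even though pair* succeeds
        if new_pos is None:
            break
        pos = new_pos
    if pos != len(text):
        raise furthest(best, ParseError(pos, "EOF"))
    return pos


if __name__ == "__main__":
    try:
        match_program("abaa")
    except ParseError as exc:
        print("ParseError:", exc, "at position", exc.pos)
```

With the remembered error in play, the failure reported for "abaa" is the one
at position 3 (expected "b") rather than the EOF failure at position 2. Storing
the furthest error on a shared object instead of threading it through every
return value would be the analogue of the Table-based variant mentioned above.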


I'm happy to prepare a patch to fix this issue. However, I'd like to know from
the people responsible for this code whether my solution is acceptable and
whether there is anything I should be watching for. (I'm only beginning to
figure out what all the parsing stuff is doing.)

files: bug_case.py
messages: 2982
nosy: pypy-issue, winstonewert
priority: feature
status: unread
title: pypy.rlib.parsing gives unhelpful parse errors

PyPy bug tracker <tracker at bugs.pypy.org>
