How do I tell "incomplete input" from "invalid input"?

Wed Jan 11 08:50:37 EST 2012

Hi,

I have been trying to figure out a reliable way to determine
incomplete Python script
input using Python C API. (Apology if it is OT here, I'm not sure where my post
belongs, perhaps to cplusplus-sig list.)

Apparently, most pointers lead to the Python FAQ [1] question:
How do I tell "incomplete input" from "invalid input"?

Unfortunately, this FAQ is either old or incomplete thus incorrect.

First, the proposed testcomplete() function uses internal symbols
which are not available to Python C API users. So, "whoever wrote that FAQ
should be given 20 lashes with a short piece of string" [2].

The second solution is incomplete or incorrect. It does not handle correctly
multi-line input longer than two lines with more flow control statements.
For example:

##########################################################################
>>> n = 10
>>> if n > 0:
...        if n < 100:
  File "<stdin>", line 2
    if n < 100:
              ^
IndentationError: expected an indented block
>>>
##########################################################################

or

##########################################################################
>>> for n in range(0, 5):
...     if n > 2:
  File "<stdin>", line 2
    if n > 2:
            ^
IndentationError: expected an indented block
>>>
##########################################################################

I have attached a slightly modified C++ version of the second program
from the FAQ question [1],
file faq_incomplete_input.cpp  which is also available from my GitHub repo [3]
In this program, I added several FIX comments with proposed corrections.
The idea is to additionally check for
PyErr_ExceptionMatches (PyExc_IndentationError)
and
strcmp (msg, "expected an indented block")
and
prompt is sys.ps2, means more code expected.

And, ignore errors until user confirms the input is finished,
so the whole input is eventually sent to the Py_CompileString
and then all exceptions are not ignored, but considered
as real result of compilation.

I simply wanted to achieve similar semantic to codeop._maybe_compile()
(called by codeop.compile_command) which performs some sort of dirty
hack in the following line:

if not code1 and repr(err1) == repr(err2):

So, the test in action for multi-line multi-statement input gives:

##########################################################################
>>> c = codeop.compile_command("for n in range(0, 3):", "test", "single")
err1 SyntaxError('unexpected EOF while parsing', ('test', 1, 22, 'for
n in range(0, 3):\n'))
err2 IndentationError('expected an indented block', ('test', 2, 1, '\n'))
comparison.err1 SyntaxError('unexpected EOF while parsing', ('test',
1, 22, 'for n in range(0, 3):\n'))
comparison.err2 IndentationError('expected an indented block',
('test', 2, 1, '\n'))
code None
code1 None
>>> c = codeop.compile_command("for n in range(0, 3):\n\tif n > 0:", "test", "single")
err1 IndentationError('expected an indented block', ('test', 2, 11,
'\tif n > 0:\n'))
err2 IndentationError('expected an indented block', ('test', 3, 1, '\n'))
comparison.err1 IndentationError('expected an indented block',
('test', 2, 11, '\tif n > 0:\n'))
comparison.err2 IndentationError('expected an indented block',
('test', 3, 1, '\n'))
code None
code1 None
>>>
##########################################################################

So, I reckon it make sense to use the same logic to when calling
Py_CompileString.

Does it sound as reasonable solution?

Basically, there seem to be no canonical solution anywhere presented
on how to perform incomplete input tests in reliable manner, how to perform
parsing/compilation in subsequent steps against Python code given line-by-line.

The C API used by Python function compile() is not publicly available.

There is PyRun_InteractiveLoop mechanism but it is tightly coupled to
FILE-based I/O which is not always available when Python is embedded,
so the loop is useless in number of situations.

Have I overlooked any other obvious solution?

Finally, it would be helpful if the Python FAQ is up to date.

[1] http://docs.python.org/py3k/faq/extending.html#how-do-i-tell-incomplete-input-from-invalid-input
[2] http://mail.python.org/pipermail/python-list/2004-August/887195.html
[3] https://github.com/mloskot/workshop/blob/master/python/

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net
-------------- next part --------------
A non-text attachment was scrubbed...
Name: faq_incomplete_input.cpp
Type: text/x-c++src
Size: 4938 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-list/attachments/20120111/7391c55a/attachment.cpp>