[ python-Bugs-984714 ] unknown parsing error

SourceForge.net noreply at sourceforge.net
Wed Jul 21 07:36:40 CEST 2004


Bugs item #984714, was opened at 2004-07-03 21:39
Message generated for change (Comment added) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=984714&group_id=5470

Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Igor Sidorenkov (gazum)
Assigned to: Martin v. Löwis (loewis)
Summary: unknown parsing error

Initial Comment:
I am getting "unknown parsing error" when trying to run 
a script with the following first line: 
#@+leo-encoding=cp1251.

If I add a couple of empty lines or 
# -*- coding: cp1251 -*-
then everything is ok.

I am using ActiveState python 2.3.3 on
Win2K server.

---------- Python ----------
error=22
  File "test.py", line 1
SyntaxError: unknown parsing error

Output completed (0 sec consumed) - Normal 
Termination
------------------------------
#@+leo-encoding=cp1251.
#@+node:0::@file test.py
#@+body
for i in range(5):
	print i
#@-body
#@-node:0::@file test.py
#@-leo


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2004-07-21 07:36

Message:
Logged In: YES 
user_id=21627

The patch is wrong. The PEP deliberately allows for
arbitrary occurrences of the substring "coding", in
particular inside "encoding". This was done so that other
editors, like vi or LEO, can continue to use their own
encoding declarations and Python will still recognize them.

Unfortunately, LEO decided to add a full stop at the end of
the line, so Python looks for an encoding named "cp1251.".
We agree with the LEO author that this is a problem in LEO
and that it will be fixed there. Alternatively, we could amend
the PEP and declare that trailing dots are not part of the
encoding name.

The other part of the patch is correct; I have applied it as
pythonrun.c 2.195.6.6 and 2.207. It would be even better if
we could display the actual cause of the problem, but that
is currently not supported in the parser.
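
To make the failure concrete, here is a minimal Python sketch of how a
search for "coding" ends up with the name "cp1251." from the LEO line.
The pattern is modelled on the regular expression given in the PEP; the
names CODING_RE, declared_encoding and is_known_codec are illustrative
assumptions, and the real check is done in C in the tokenizer, so it may
differ in detail.

```python
import codecs
import re

# Assumption: pattern modelled on the regular expression quoted in the PEP.
# It matches "coding" anywhere, so it also fires inside "encoding".
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")


def declared_encoding(line):
    """Return the encoding name a comment line appears to declare, or None."""
    if not line.lstrip().startswith("#"):
        return None
    match = CODING_RE.search(line)
    return match.group(1) if match else None


def is_known_codec(name):
    """Report whether the codec registry can resolve the given name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


leo_line = "#@+leo-encoding=cp1251."
emacs_line = "# -*- coding: cp1251 -*-"

for line in (leo_line, emacs_line):
    name = declared_encoding(line)
    # The dot is in the character class, so the LEO line yields "cp1251."
    print(repr(line), "->", repr(name), "known:", is_known_codec(name))
```

The trailing full stop is captured because "." is a legal character in
an encoding name under this pattern, which is why stripping it would
need a change to the PEP rather than just to the code.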

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2004-07-21 05:16

Message:
Logged In: YES 
user_id=33168

Martin, I hope you don't mind me assigning this to you.  I
think you implemented the coding spec.  I briefly read the
PEP and while the code does what the PEP states (i.e., uses a
regex), the behaviour doesn't match the examples.  It also
seems like it could be error-prone to allow r'#.*coding[:=]'.

I think there are two issues.  
1) in pythonrun.c in E_DECODE there is a missing break
2) the check for # -*- coding is not strict enough
    The patch makes the check r'# (-\*-)? coding[:=]'

The attached patch addresses both issues, although I'm not
sure you will agree #2 is a problem.  

Feel free to check in, assign back to me, or whatever.  I'm
not sure what the error message in pythonrun should be;
right now it's "unknown decode error."  Perhaps that should
be "invalid encoding" or something?
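
As a rough illustration of issue 2, the Python sketch below compares the
loose check with a stricter one. LOOSE is the pattern quoted above.
STRICT is only an approximation of the patched check (an assumption: it
uses \s* so that "# coding: ..." with a single space still matches), and
the helper name recognised is hypothetical. The vim line follows the
"fileencoding" form shown in the PEP.

```python
import re

# The loose check quoted in the comment above.
LOOSE = re.compile(r"#.*coding[:=]")

# Assumption: an approximation of the stricter check, not the patch itself.
STRICT = re.compile(r"#\s*(-\*-)?\s*coding[:=]")


def recognised(pattern, line):
    """True if the pattern accepts the line as an encoding declaration."""
    return pattern.match(line) is not None


samples = [
    "# -*- coding: cp1251 -*-",          # Emacs style
    "# vim: set fileencoding=cp1251 :",  # vim style
    "#@+leo-encoding=cp1251.",           # the LEO line from this report
]

for line in samples:
    print(repr(line), "loose:", recognised(LOOSE, line),
          "strict:", recognised(STRICT, line))
```

Under these assumptions the stricter pattern rejects the LEO line, but
it also stops recognising the vim-style declaration, which is the
trade-off to weigh for issue 2.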

----------------------------------------------------------------------
