[ python-Bugs-1163244 ] Syntax error on large file with MBCS encoding

SourceForge.net noreply at sourceforge.net
Fri Apr 15 01:40:37 CEST 2005


Bugs item #1163244, was opened at 2005-03-14 21:20
Message generated for change (Comment added) made by doerwalter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1163244&group_id=5470

Category: Parser/Compiler
Group: Python 2.4
Status: Open
Resolution: Accepted
Priority: 7
Submitted By: Tim N. van der Leeuw (tnleeuw)
Assigned to: Nobody/Anonymous (nobody)
Summary: Syntax error on large file with MBCS encoding

Initial Comment:
Large files generated by makepy.py from the win32all
extensions cannot be compiled by Python 2.4.1rc1; they
give a syntax error.

This is a regression from 2.3.5.

(With Python 2.4, the interpreter crashed instead. That is
fixed now.)

If the mbcs encoding line is removed from the top of the
file, compilation succeeds.

The file should be attached as a zip file. It probably
requires the win32all extensions to be installed in order to
be compiled / imported (it was generated using build 203 of
the win32all extensions).


----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2005-04-15 01:40

Message:
Logged In: YES 
user_id=89016

Importing foo2.py on Linux (with the current CVS HEAD
version of Python) gives me a segmentation fault with the
following stack trace:
0x080606cc in instance_repr (inst=0xb7c158bc) at
Objects/classobject.c:880
880                     classname = inst->in_class->cl_name;
(gdb) bt
#0  0x080606cc in instance_repr (inst=0xb7c158bc) at
Objects/classobject.c:880
#1  0x08082235 in PyObject_Repr (v=0xb7c158bc) at
Objects/object.c:308
#2  0x080f3ccd in err_input (err=0xbfffe000) at
Python/pythonrun.c:1478
#3  0x080f3956 in PyParser_SimpleParseFileFlags
(fp=0x818d6e0, filename=0xbfffe530 "foo2.py", start=257,
flags=0)
    at Python/pythonrun.c:1348
#4  0x080f3982 in PyParser_SimpleParseFile (fp=0x818d6e0,
filename=0xbfffe530 "foo2.py", start=257)
    at Python/pythonrun.c:1355
#5  0x080e6fef in parse_source_module (pathname=0xbfffe530
"foo2.py", fp=0x818d6e0) at Python/import.c:761
#6  0x080e72db in load_source_module (name=0xbfffe9d0
"foo2", pathname=0xbfffe530 "foo2.py", fp=0x818d6e0)
    at Python/import.c:885
#7  0x080e86b4 in load_module (name=0xbfffe9d0 "foo2",
fp=0x818d6e0, buf=0xbfffe530 "foo2.py", type=1, loader=0x0)
    at Python/import.c:1656
#8  0x080e9d52 in import_submodule (mod=0x8145768,
subname=0xbfffe9d0 "foo2", fullname=0xbfffe9d0 "foo2")
    at Python/import.c:2250
#9  0x080e9511 in load_next (mod=0x8145768,
altmod=0x8145768, p_name=0xbfffedf0, buf=0xbfffe9d0 "foo2",
p_buflen=0xbfffe9cc)
    at Python/import.c:2070
#10 0x080e8e5e in import_module_ex (name=0x0,
globals=0xb7d62e94, locals=0xb7d62e94, fromlist=0x8145768)
    at Python/import.c:1905
#11 0x080e914b in PyImport_ImportModuleEx (name=0xb7cd8824
"foo2", globals=0xb7d62e94, locals=0xb7d62e94, 
    fromlist=0x8145768) at Python/import.c:1946
#12 0x080b5c87 in builtin___import__ (self=0x0,
args=0xb7d1e634) at Python/bltinmodule.c:45
#13 0x0811d32e in PyCFunction_Call (func=0xb7d523ec,
arg=0xb7d1e634, kw=0x0) at Objects/methodobject.c:73
#14 0x0805d188 in PyObject_Call (func=0xb7d523ec,
arg=0xb7d1e634, kw=0x0) at Objects/abstract.c:1757
#15 0x080ca79d in PyEval_CallObjectWithKeywords
(func=0xb7d523ec, arg=0xb7d1e634, kw=0x0) at Python/ceval.c:3425
#16 0x080c6719 in PyEval_EvalFrame (f=0x816dd7c) at
Python/ceval.c:2026
#17 0x080c8fdd in PyEval_EvalCodeEx (co=0xb7cf1ef0,
globals=0xb7d62e94, locals=0xb7d62e94, args=0x0, argcount=0,
kws=0x0, 
    kwcount=0, defs=0x0, defcount=0, closure=0x0) at
Python/ceval.c:2736
#18 0x080bffb0 in PyEval_EvalCode (co=0xb7cf1ef0,
globals=0xb7d62e94, locals=0xb7d62e94) at Python/ceval.c:490
#19 0x080f361d in run_node (n=0xb7d122d0, filename=0x8123ba3
"<stdin>", globals=0xb7d62e94, locals=0xb7d62e94, 
    flags=0xbffff584) at Python/pythonrun.c:1265
#20 0x080f1f58 in PyRun_InteractiveOneFlags (fp=0xb7e94720,
filename=0x8123ba3 "<stdin>", flags=0xbffff584)
    at Python/pythonrun.c:762
#21 0x080f1c93 in PyRun_InteractiveLoopFlags (fp=0xb7e94720,
filename=0x8123ba3 "<stdin>", flags=0xbffff584)
    at Python/pythonrun.c:695
#22 0x080f1af6 in PyRun_AnyFileExFlags (fp=0xb7e94720,
filename=0x8123ba3 "<stdin>", closeit=0, flags=0xbffff584)
    at Python/pythonrun.c:658
#23 0x08055e45 in Py_Main (argc=1, argv=0xbffff634) at
Modules/main.c:484
#24 0x08055366 in main (argc=1, argv=0xbffff634) at
Modules/python.c:23

The value object in err_input() (in the E_DECODE case) seems
to be bogus (it gives me a refcount of -606348325).
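
A quick way to look at that refcount is to view it as raw bytes.
The sketch below assumes a 32-bit signed ob_refcnt on a
little-endian machine (consistent with the 32-bit addresses in
the backtrace). Every byte comes out as 0xDB, which is the
pattern CPython's debug memory allocator writes over freed
blocks. If this was a debug build (an assumption, the report
does not say), that would suggest the value object had already
been freed when err_input() touched it.

```python
import struct

# Refcount value as reported from gdb in the comment above.
refcount = -606348325

# Assumption: ob_refcnt is a 32-bit signed int on little-endian x86.
raw = struct.pack("<i", refcount)

print(raw.hex())                      # dbdbdbdb
print(all(b == 0xDB for b in raw))    # True
```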

----------------------------------------------------------------------

Comment By: Timo Linna (tilinna)
Date: 2005-04-09 10:09

Message:
Logged In: YES 
user_id=1074183

A connection to n*512 blocks seems very likely, and the
problem is not just MBCS-related. I managed to reproduce it
with a file that contains an ascii coding declaration, a
section of close to 1024 bytes, an extra cr/lf and a comment;
that file raises a SyntaxError in Python 2.4.1.

Could this be linked to the new codec buffering code? See:
www.python.org/sf/1178484
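
One way to probe the codec-buffering idea in isolation is to
compare the lines returned by a codec StreamReader with a plain
split of the decoded text. This is only a sketch: the sample
layout (16 comment lines of 64 bytes each) is a hypothetical
stand-in for the file described above, and it exercises
codecs.StreamReader.readline() directly rather than the parser.

```python
import codecs
import io


def readline_matches_split(data, encoding="ascii"):
    """Return True if StreamReader.readline() yields the same lines
    as decoding everything at once and splitting on line ends."""
    expected = data.decode(encoding).splitlines(True)
    reader = codecs.getreader(encoding)(io.BytesIO(data))
    got = []
    while True:
        line = reader.readline()
        if not line:          # empty string signals end of input
            break
        got.append(line)
    return got == expected


# Hypothetical sample: coding line, 1024 bytes of comments,
# an extra cr/lf, then a comment.
filler = (b"#" + b"x" * 61 + b"\r\n") * 16
sample = b"# -*- coding: ascii -*-\r\n" + filler + b"\r\n# comment\r\n"

print(readline_matches_split(sample))
```

A False result on some interpreter version would point at the
reader rather than at the tokenizer.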


----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2005-03-21 14:34

Message:
Logged In: YES 
user_id=539787

This could be irrelevant, but are the other block sizes also
close to n*512 marks (e.g. 1532 is close to 1536 = 3*512)?
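
For checking candidate sizes against that idea, a small helper
like the following gives the nearest multiple of the block size
and the distance from it. The function name is made up for
illustration, and 512 is just the block size suggested above.

```python
def offset_from_block(size, block=512):
    """Return (n, delta) such that size == n * block + delta,
    with n chosen as the nearest multiple of block."""
    n = (size + block // 2) // block
    return n, size - n * block


print(offset_from_block(1532))   # (3, -4): four short of 3*512
print(offset_from_block(1024))   # (2, 0)
```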

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2005-03-21 13:11

Message:
Logged In: YES 
user_id=14198

I believe this is a different bug from the recent
"long-lines" errors (see below).  I can reproduce this with
a file that uses neither long lines nor any pywin32
extensions (2.4 branch and trunk).

A Python source file containing:
-- start snippet --
# -*- coding: mbcs -*-
<1532 characters of code or comments>
<cr/lf newline>
x = {}
-- end snippet --

will yield a SyntaxError when attempting to import the
module.  Running the module as a script does not provoke the
error.
    
To reproduce, there must be exactly 1532 characters where
specified (see the attached file for a demo).  Adding or
removing even a single character will prevent the error.  It
is possible to replace characters with any others, including
valid code, and still see the error; however, the number of
characters must remain the same.  cr/lf pairs can also be
replaced with any other 2 characters.  There are other
"block sizes" that will provoke the error, but this is the
only one I have nailed down.
    
Apart from the "block" of 1532 characters, the coding line
and the blank line before the dict assignment also appear
critical.  Unlike the other characters in the block, this
last cr/lf pair cannot be replaced with comments.  I can't
provoke the error with other encodings (note there are no
encoded characters in the sample - it is trivial).
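
For anyone without the attachment, the layout described above can
be approximated with a generator like the one below. It is a
sketch under assumptions, not a copy of foo2.py: it assumes the
1532 characters include the cr/lf terminators of the filler
lines, that the filler consists only of comment lines, and that
the last line ends with cr/lf. The names build_source and
import_succeeds and the width parameter are hypothetical.
Whether this exact filler triggers the error is untested; the
attached file remains the reference.

```python
import os
import subprocess
import sys
import tempfile

CRLF = b"\r\n"


def build_source(filler_chars=1532, encoding="mbcs", width=72):
    """Coding line, exactly filler_chars bytes of comment lines,
    one blank line, then the dict assignment."""
    if filler_chars < 3 or width < 6:
        raise ValueError("need filler_chars >= 3 and width >= 6")
    filler = b""
    remaining = filler_chars
    while remaining > 0:
        size = min(width, remaining)
        if 0 < remaining - size < 3:
            size -= 3      # keep at least 3 bytes for a final "#" + cr/lf
        filler += b"#" + b"x" * (size - 3) + CRLF
        remaining -= size
    coding = b"# -*- coding: " + encoding.encode("ascii") + b" -*-" + CRLF
    return coding + filler + CRLF + b"x = {}" + CRLF


def import_succeeds(source, python=sys.executable):
    """Write source as foo2.py in a temp dir and import it in a
    child interpreter, as in: python -c "import foo2"."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "foo2.py"), "wb") as f:
            f.write(source)
        result = subprocess.run([python, "-c", "import foo2"], cwd=tmp,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    return result.returncode == 0
```

The harness itself needs a recent Python, but the python argument
can point at a different interpreter to test. Looping
filler_chars over a range of values is one way to hunt for the
other "block sizes"; the mbcs declaration only works on Windows,
so pass encoding="ascii" elsewhere.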

To reproduce, save the attached file on Windows and execute:
> python -c "import foo2"
Traceback (most recent call last):
  File "<string>", line 1, in ?
  File "foo2.py", line 24
x = {}
    ^
SyntaxError: invalid syntax

Note that Python 2.3 and earlier all work, and that
"python foo2.py" also works.  The code is clearly valid.

I haven't tried to reproduce this on Linux (mbcs isn't
available there, and I can't get a test case that doesn't
use it).

Other pointers/notes: pywin32 bug 1085454 is related to
long lines.  By all accounts that underlying error has been
fixed, but I can't verify this as pywin32 no longer generates
insanely long lines.  I can confirm Python bugs
1101726/1089395 still crash Python 2.3+.  I believe all of
these (including this one) are distinct bugs.

[foo2.py is my attachment - ya gotta love sourceforge :)]

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2005-03-20 11:28

Message:
Logged In: YES 
user_id=539787

Useful pointers: on python-dev, this has been characterised
as related to pywin32 bug 1085454.  It is also related to
www.python.org/sf/1101726 and www.python.org/sf/1089395.

----------------------------------------------------------------------
