[Python-bugs-list] [ python-Bugs-603509 ] MemoryError when eval'ing string
noreply@sourceforge.net
Tue, 03 Sep 2002 04:54:55 -0700
Bugs item #603509, was opened at 2002-09-02 15:56
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=603509&group_id=5470
Category: Python Interpreter Core
Group: Python 2.3
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Tim Peters (tim_one)
Assigned to: Martin v. Löwis (loewis)
Summary: MemoryError when eval'ing string
Initial Comment:
eval("'label;home;encoding=quoted-printable:r.'")
dies with a bogus MemoryError. Assigned to Martin
because this minimal substring dies the same way:
eval("'coding=q'")
Of course the result should be the string
coding=q
Somehow it looks like parsing a string literal is getting
mixed up with searching for a source-file encoding.
----------------------------------------------------------------------
>Comment By: Martin v. Löwis (loewis)
Date: 2002-09-03 13:54
Message:
Logged In: YES
user_id=21627
Fixed with
tokenizer.c 2.66;
tokenizer.h 2.19;
ref2.tex 1.43;
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2002-09-02 23:23
Message:
Logged In: YES
user_id=31435
I don't understand the deeper issues here, but
eval(repr(s)) == s
must be true for every string s. Take that as an absolute
requirement and I'm sure you'll find a way to do it <wink>.
Waiting for a complaint isn't really an option. It's been
perfectly safe to dump strings out to text files via repr(),
and restore them via eval(), since Python's first release.
The program I was running when this happened was doing
exactly that. The strings it was dumping and restoring
came from c.l.py msgs, and there's no string that can be
guaranteed not to show up there. In particular, it's likely
that a msg containing an encoding decoration will show up
there as an example.
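
The round-trip requirement above can be checked with a small sketch.
It is an illustration, not the program that triggered the report; the
sample strings and the helper name roundtrips are assumptions chosen
to include the strings from this report plus an encoding decoration.

```python
# Hypothetical sample strings; the first three come from this report.
samples = [
    "coding=q",
    "#coding=q",
    "label;home;encoding=quoted-printable:r.",
    "# -*- coding: latin-1 -*-",
    "line one\nline two",
    "quote ' and \" mixed",
]


def roundtrips(s):
    """Return True if dumping s with repr() and restoring with eval() yields s."""
    return eval(repr(s)) == s


# Map each sample to whether it survives the repr()/eval() round trip.
results = {s: roundtrips(s) for s in samples}

for s, ok in results.items():
    print(repr(s), "ok" if ok else "FAILED")
```

On an interpreter with this bug fixed, every sample should report ok.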
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2002-09-02 19:40
Message:
Logged In: YES
user_id=21627
The attached patch fixes the problem. It is still possible
to trick this code, with
eval("'#coding=q'")
I'm not really sure how to deal with that; I see the
following options:
1. tighten PEP 263 to require that the encoding comment is
the only thing in a source line.
2. perform some minimal scanning of the line, to see whether
we are inside a string literal when we see the #. This can
probably be tricked with a multi-line string.
3. perform source encoding analysis later, in the tokenizer
proper, where comments are detected. This would be a heavy
change.
4. Just apply this patch, and wait until somebody complains.
Directions appreciated.
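
A rough sketch of how the detection strategies differ, written as
Python regular expressions rather than the C code in tokenizer.c. The
names NAIVE, AFTER_HASH, COMMENT_ONLY and find_encoding are
hypothetical, and the first two patterns are assumptions about the
general approach, not the actual patch. COMMENT_ONLY follows the
pattern PEP 263 documents for an encoding declaration, which
corresponds to option 1.

```python
import re

# Assumption: searching for the marker anywhere in the line.
NAIVE = re.compile(r"coding[:=]\s*([-\w.]+)")

# Assumption: requiring a '#' somewhere before the marker.
AFTER_HASH = re.compile(r"#.*?coding[:=]\s*([-\w.]+)")

# Option 1: the comment must be the only thing on the line.
COMMENT_ONLY = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def find_encoding(line, pattern):
    """Return the encoding name the pattern finds in line, or None."""
    m = pattern.search(line)
    return m.group(1) if m else None


for line in ["'coding=q'", "'#coding=q'", "# -*- coding: latin-1 -*-"]:
    print(line,
          find_encoding(line, NAIVE),
          find_encoding(line, AFTER_HASH),
          find_encoding(line, COMMENT_ONLY))
```

Only the comment-only pattern ignores both string literals. It can
still match a line inside a multi-line string if that line begins
with #, which is the same caveat noted for option 2.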
----------------------------------------------------------------------