[Patches] [ python-Patches-505705 ] Remove eval in pickle and cPickle

noreply@sourceforge.net
Mon, 12 Aug 2002 16:40:08 -0700


Patches item #505705, was opened at 2002-01-19 10:21
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: Out of Date
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: Martin v. Löwis (loewis)
Summary: Remove eval in pickle and cPickle

Initial Comment:
This patch removes the use of eval in pickle and cPickle.

It does so by:
- moving the actual parsing from compile.c:parsestr to
PyString_DecodeEscape
- introducing a new codec string-escape
- removing the code that checks that a
string-to-unpickle is properly escaped throughout, and
replacing it with a check that it is properly quoted,
- unquoting the string in load_string, then passing it
to the codec.

This fixes #502503.
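
The last two steps above (check the quoting, strip the quotes, hand the
rest to an escape decoder) can be sketched in Python. This is only an
illustration, not the patch itself: it is written for Python 3, the
function name load_string is borrowed from the description, and
codecs.escape_decode is an undocumented CPython helper used here as a
stand-in for the proposed string-escape codec.

```python
import codecs


def load_string(arg: bytes) -> bytes:
    """Decode a quoted string argument without calling eval().

    `arg` is assumed to be the text after the opcode, with the trailing
    newline already removed.
    """
    # The argument must be wrapped in one matching pair of quotes.
    if len(arg) < 2 or arg[:1] != arg[-1:] or arg[:1] not in (b"'", b'"'):
        raise ValueError("insecure string pickle: argument is not quoted")

    body = arg[1:-1]  # strip the surrounding quotes

    # Only backslash escapes are interpreted; nothing is executed.
    decoded, _consumed = codecs.escape_decode(body)
    return decoded
```

Because the body is never evaluated as an expression, input such as
__import__('os') is rejected by the quoting check instead of being run.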

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-08-13 01:40

Message:
Logged In: YES 
user_id=21627

New version:
- removed triple quote support in pickle/cPickle

- added Lib/encodings/string_escape.py again

- added PyString_Repr, which takes a smartquotes argument

- recode_encoding is for PEP 263: the parser generates UTF-8
in the abstract syntax, which needs to be re-encoded with
the original encoding. Unfortunately, \-escaping and UTF-8
may interleave, hence the convoluted code.
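
To illustrate why \-escapes and UTF-8 interleave, here is a rough Python
sketch of the re-encoding idea. It is not the C code in the patch: the
function name recode_literal_body is hypothetical, and it assumes the
literal body arrives as UTF-8 bytes that must go back to the source
file's declared encoding while the escape sequences are left alone for a
later escape-decoding pass.

```python
def recode_literal_body(utf8_body: bytes, source_encoding: str) -> bytes:
    """Re-encode the non-escape parts of a string literal body.

    Backslash escapes are copied through unchanged; every run of text
    between them is converted from UTF-8 to `source_encoding`.
    """
    out = bytearray()
    i, n = 0, len(utf8_body)
    while i < n:
        if utf8_body[i] == 0x5C:  # backslash starts an escape sequence
            out.append(0x5C)
            i += 1
            # Copy the escaped character verbatim if it is plain ASCII.
            if i < n and utf8_body[i] < 0x80:
                out.append(utf8_body[i])
                i += 1
        else:
            # A backslash byte never occurs inside a multi-byte UTF-8
            # sequence, so splitting at the next backslash is safe.
            j = utf8_body.find(b"\\", i)
            if j == -1:
                j = n
            out += utf8_body[i:j].decode("utf-8").encode(source_encoding)
            i = j
    return bytes(out)
```

For a Latin-1 source file, the UTF-8 bytes for an e-acute followed by \n
come back as the single byte 0xE9 followed by the untouched \n escape.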

On the Sam Penrose article:

Without patch:
dumping list of 1000 dicts:
dumped: 0.192386031151
loading list of 1000 dicts:
loaded: 2.46496498585
dumping list of 10000 dicts:
dumped: 1.92456102371
loading list of 10000 dicts:
loaded: 24.6884089708

With patch:
dumping list of 1000 dicts:
dumped: 0.201091051102
loading list of 1000 dicts:
loaded: 0.469774007797
dumping list of 10000 dicts:
dumped: 1.94221496582
loading list of 10000 dicts:
loaded: 4.8661159277

So loading is about 5 times faster on this benchmark; dumping speed
is essentially unchanged.
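
As a quick check of that factor, the load timings quoted above can be
divided directly. The numbers are copied from this comment; the variable
names are only for illustration.

```python
# (without patch, with patch) load timings, keyed by number of dicts
load_timings = {
    1000: (2.46496498585, 0.469774007797),
    10000: (24.6884089708, 4.8661159277),
}

# Speedup is the old time divided by the new time.
speedups = {
    count: before / after
    for count, (before, after) in load_timings.items()
}

for count, factor in sorted(speedups.items()):
    print("%5d dicts: loading is %.2f times faster" % (count, factor))
```

Both ratios come out a little above 5.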

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-08-12 03:51

Message:
Logged In: YES 
user_id=6380

Closer.

- Why bother stripping triple quotes in the pickle/cPickle
load code? These will never happen as a result of a pickle
dump AFAIK, and the code you are replacing doesn't accept
these either AFAICT.

- There's something missing (the previous version of the
patch had it I believe) that's needed to register the codec;
as a consequence, pickle.loads() doesn't work.

- escape_encode() uses repr() of a string to do the work.
But that means the outcome for embedding string quotes is
confusing, because of the "smarts" in repr() that use " for
surrounding quotes when there's a ' in the string, and vice
versa. Thus, a single quote or a double quote is returned
unquoted; but if they both occur in the same string, the
single quote is quoted. I don't think that's particularly
useful. Maybe there should be an underlying primitive
operation that gives you a choice and which is invoked both
by escape_encode() and string repr()?

- I don't understand the recode_encoding stuff, but it looks
like something like that was present before too. :-)
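
The repr() quoting point above can be made concrete with a small Python
sketch of a primitive that lets the caller choose. It is an assumption
about what such a primitive could look like, not code from the patch:
the name string_repr and the smartquotes flag are illustrative, and it
works on Python 3 bytes.

```python
def string_repr(data: bytes, smartquotes: bool) -> str:
    """Return a quoted, backslash-escaped rendering of `data`.

    With smartquotes=False the result always uses single quotes, so an
    embedded single quote is always escaped. With smartquotes=True,
    double quotes are used when the data contains ' but not ".
    """
    quote = "'"
    if smartquotes and b"'" in data and b'"' not in data:
        quote = '"'

    named = {"\t": "\\t", "\n": "\\n", "\r": "\\r"}
    parts = [quote]
    for byte in data:
        ch = chr(byte)
        if ch == quote or ch == "\\":
            parts.append("\\" + ch)        # escape the delimiter and backslash
        elif ch in named:
            parts.append(named[ch])        # common control characters
        elif byte < 0x20 or byte >= 0x7F:
            parts.append("\\x%02x" % byte)  # everything else non-printable
        else:
            parts.append(ch)
    parts.append(quote)
    return "".join(parts)
```

With smartquotes=False the escaping of a single quote no longer depends
on whether a double quote happens to appear in the same string, which is
the predictable behaviour an escape codec would want; string repr can
keep the smart behaviour by passing smartquotes=True.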

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-11 22:47

Message:
Logged In: YES 
user_id=21627

Updated to current CVS.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-08-11 16:09

Message:
Logged In: YES 
user_id=6380

This would fix bug #593656 too.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-08-09 16:48

Message:
Logged In: YES 
user_id=6380

I like this idea. But the patch is out of date. Can you
rework the patch? How much faster does this make the test
program from

http://groups.google.de/groups?hl=en&lr=&ie=UTF-8&selm=mailman.1026940226.16076.python-list%40python.org

???

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-08-09 06:12

Message:
Logged In: YES 
user_id=6380

I'll review this.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-01-19 10:25

Message:
Logged In: YES 
user_id=21627

BTW, this patch has #500002 as a prerequisite.

----------------------------------------------------------------------
