[Patches] [ python-Patches-576101 ] Alternative implementation of interning

noreply@sourceforge.net noreply@sourceforge.net
Sat, 10 Aug 2002 03:01:39 -0700


Patches item #576101, was opened at 2002-07-01 19:23
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Oren Tirosh (orenti)
Assigned to: Nobody/Anonymous (nobody)
Summary: Alternative implementation of interning

Initial Comment:
An interned string now has a flag set indicating that it is 
interned, instead of a pointer to the interned string. This 
pointer was almost always either NULL or pointing to the 
string object itself. The other cases were rare and ineffective 
as an optimization.  Replacing the pointer with a flag saves 
an average of 3 bytes per string.

Interned strings are no longer immortal.  They are 
automatically destroyed when there are no more 
references to them except the global dictionary of 
interned strings.
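
The lifetime rule described above can be modelled in a few lines of
Python. This is only a sketch, not the patch itself: it uses weak
references where the patch works at the C level in stringobject.c, and
the names Name, _interned and intern_name are hypothetical. A small
wrapper class stands in for the string object because a plain str
cannot be weakly referenced.

```python
import gc
import weakref


class Name:
    """Stand-in for a string object; plain str cannot be weakly referenced."""
    __slots__ = ("value", "__weakref__")

    def __init__(self, value):
        self.value = value


# Global table of interned objects. It holds only weak references, so an
# entry vanishes once no other reference to the object remains.
_interned = weakref.WeakValueDictionary()


def intern_name(value):
    """Return the canonical Name for value, creating it on first use."""
    obj = _interned.get(value)
    if obj is None:
        obj = Name(value)
        _interned[value] = obj
    return obj


a = intern_name("spam")
b = intern_name("spam")
same_object = a is b            # both names refer to one canonical object
size_while_alive = len(_interned)

del a, b                        # drop the last outside references
gc.collect()                    # harmless on CPython, helps other runtimes
size_after_release = len(_interned)
```

Once the last outside reference is dropped, the table entry disappears
on its own, which is the behaviour the patch aims for with real string
objects.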

A new function (actually a macro), PyString_CheckInterned, 
checks whether a string is interned.  There are no 
more references to ob_sinterned anywhere outside 
stringobject.c.


----------------------------------------------------------------------

>Comment By: Oren Tirosh (orenti)
Date: 2002-08-10 10:01

Message:
Logged In: YES 
user_id=562624

General cleanup. Better handling of immortal interned
strings for backward compatibility.

The patch passes regrtest but causes test_gc to leak 20
objects: 13 from test_finalizer_newclass and 7 from
test_del_newclass, but only if test_saveall is used. I've
tried earlier versions of this patch (which were ok at the
time) and they now show the same leak.

----------------------------------------------------------------------

Comment By: Oren Tirosh (orenti)
Date: 2002-07-06 16:08

Message:
Logged In: YES 
user_id=562624

Oops, forgot to actually attach the patch. Here it is.


----------------------------------------------------------------------

Comment By: Oren Tirosh (orenti)
Date: 2002-07-06 14:35

Message:
Logged In: YES 
user_id=562624

This implementation supports both mortal and immortal interned 
strings.

PyString_InternInPlace creates an immortal interned string for 
backward compatibility with code that relies on this behavior.

PyString_Intern creates a mortal interned string that is 
deallocated when its reference count reaches 0.  Note that if 
the string value has previously been interned as immortal, 
this will not make it mortal.
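
The interaction between the two kinds of interning can be sketched in
Python as follows. This is a toy model under stated assumptions, not
the C implementation: InternTable and its methods are hypothetical
names that loosely mirror PyString_Intern and PyString_InternInPlace,
and weak references stand in for whatever bookkeeping the patch does
in stringobject.c.

```python
import gc
import weakref


class Name:
    """Stand-in for a string object; plain str cannot be weakly referenced."""
    __slots__ = ("value", "__weakref__")

    def __init__(self, value):
        self.value = value


class InternTable:
    """Toy intern table with mortal and immortal entries."""

    def __init__(self):
        self._mortal = weakref.WeakValueDictionary()  # freed when unused
        self._immortal = {}                           # strong refs, never freed

    def _lookup_or_create(self, value):
        # Immortal entries win, so a later mortal request cannot demote them.
        obj = self._immortal.get(value)
        if obj is None:
            obj = self._mortal.get(value)
        if obj is None:
            obj = Name(value)
            self._mortal[value] = obj
        return obj

    def intern(self, value):
        """Mortal interning: the entry lives only while callers hold it."""
        return self._lookup_or_create(value)

    def intern_in_place(self, value):
        """Immortal interning: the table itself keeps the object alive."""
        obj = self._lookup_or_create(value)
        self._immortal[value] = obj
        return obj

    def is_interned(self, value):
        return value in self._immortal or value in self._mortal


table = InternTable()
table.intern_in_place("keep")   # immortal from now on
table.intern("keep")            # a mortal request does not make it mortal
table.intern("temp")            # result discarded, so nothing keeps it alive
gc.collect()                    # harmless on CPython, helps other runtimes

keep_survives = table.is_interned("keep")
temp_released = not table.is_interned("temp")
```

The immortal lookup comes first in _lookup_or_create, which is what
keeps a previously immortal value from becoming mortal.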

Most call sites in the interpreter were changed to PyString_Intern, 
except those where the old behavior may be required for compatibility.

This version of the patch, like the previous one, disables 
indirect interning. Is there any evidence that it is still an 
important optimization for some packages?

Make sure you rebuild everything after applying this patch 
because it modifies the size of string object headers.



----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-07-02 04:21

Message:
Logged In: YES 
user_id=80475

I like the way you consolidated all of the knowledge about 
interning into one place.

Consider adding an example to the docs of an effective use 
of interning for optimization.
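
One possible shape for such an example, written as a sketch against
current Python 3, where interning is exposed as sys.intern (the
interpreter under discussion here offers a builtin intern() instead).
The tokenize helper is hypothetical. Interning each word means equal
words share a single object, so repeated tokens cost no extra string
storage and can be compared by identity.

```python
import sys


def tokenize(text):
    """Split text into words, interning each so equal words share one object."""
    return [sys.intern(word) for word in text.split()]


tokens = tokenize("spam eggs spam ham spam")

# Equal words are now the very same object, not just equal values.
first, third = tokens[0], tokens[2]
shares_object = first is third

# Five tokens, but only three distinct string objects are kept alive.
distinct_objects = len({id(t) for t in tokens})
```

The gain shows up mostly when the same names recur many times, for
instance as dictionary keys, where a lookup can succeed on an identity
check before any character-by-character comparison is needed.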

----------------------------------------------------------------------
