Internals of interning strings

Michael Hudson mwh21 at cam.ac.uk
Fri Mar 24 12:07:13 EST 2000


"Emile van Sebille" <emile at fenx.com> writes:

> I hadn't done anything with 'intern' before, so I tried your
> example.  Of course, to save a moment, I entered:
> 
> Python 1.5.2 (#0, Apr 13 1999, 10:51:12) [MSC 32 bit (Intel)] on win32
> Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
> >>> a = 'testing'
> >>> b = a
> >>> c = intern(a)
> >>> d = 'testing'
> >>> e = intern(d)
> >>> a is b
> 1
> >>> b is c
> 1
> >>> a is e
> 1
> >>> e is d
> 1
> >>>
> 
> This is inconsistent with your results.  In light of Mike Hudson's
> response, might optimizations be happening differently on long vs
> short strings?  Or is something else going on here?

It's not short strings; it's strings that contain only characters that
can legally be present in identifiers.  The compiler interns those
string constants itself.  Look in Python/compile.c around line 250 for
the gory details; the bit relevant to the above behaviour is:

        /* Intern selected string constants */
        for (i = PyTuple_Size(consts); --i >= 0; ) {
                PyObject *v = PyTuple_GetItem(consts, i);
                char *p;
                if (!PyString_Check(v))
                        continue;
                p = PyString_AsString(v);
                if ((int)strspn(p, NAME_CHARS)
                    != PyString_Size(v))
                        continue;
                PyString_InternInPlace(&PyTuple_GET_ITEM(consts, i));
        }
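
To see which constants that loop would pick up, here is a rough Python
sketch of the same test.  It is an illustration, not the compiler's
code: the helper name compiler_would_intern is made up, and the value
of NAME_CHARS (ASCII letters, digits and underscore) is an assumption
about what compile.c defines.  'testing' passes the test, which is why
every literal 'testing' in your session was already the interned
object before you called intern() on it.

```python
import string

# Assumption: NAME_CHARS in compile.c is the ASCII letters, the digits
# and the underscore, i.e. the characters allowed in identifiers.
NAME_CHARS = string.ascii_letters + string.digits + "_"


def compiler_would_intern(s):
    """Hypothetical helper mirroring the strspn() check in the C loop.

    strspn(p, NAME_CHARS) equals the string's length exactly when every
    character of the string is in NAME_CHARS, so the constant is
    interned only in that case.
    """
    return all(ch in NAME_CHARS for ch in s)


if __name__ == "__main__":
    for candidate in ("testing", "testing 1 2 3", "not-a-name"):
        print(repr(candidate), compiler_would_intern(candidate))
```

A constant containing a space or a hyphen fails the check, so two
separate occurrences of such a literal need not be the same object
unless you intern them yourself.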

Cheers,
M.

-- 
very few people approach me in real life and insist on proving they are
drooling idiots.                         -- Erik Naggum, comp.lang.lisp


