[Tutor] <var:data> assignment [Python under the hood: optimizations at the C level]
Danny Yoo
dyoo at hkn.eecs.berkeley.edu
Tue May 11 21:59:12 EDT 2004
On Wed, 12 May 2004, Magnus Lycka wrote:
> I wrote:
> > > If you know that you use a particular string often, or need to make it
> > > faster, e.g. to speed up dictionary access with this string as a key, you
> > > can force Python to intern it. (This works only for strings.)
> ..
> > > I'm not sure exactly what algorithm Python uses to decide which objects
> > > to intern automagically.
>
> Danny responded:
> > The word "intern" really should apply to strings; I don't think intern()
> > works on arbitrary objects.
>
> I just wrote that!
Hi Magnus,
My apologies! I read your message too quickly, and skipped over the part
where you mentioned that it worked on strings only.
I have a bad habit of tunnel vision --- good when debugging code, but not
so good when communicating with people. *grin* I will try to be a better
listener next time.
> As I look further, it seems that it has nothing to do with size as I
> thought, but that all strings that are valid Python identifiers are
> interned.
>
> The manual says that "Normally, the names used in Python programs are
> automatically interned, and the dictionaries used to hold module, class
> or instance attributes have interned keys."
>
> It's obviously more extensive than that. All string *literals* that
> could possibly be names used in Python programs seem to get interned.
Yes, it's done at bytecode-compile time. In Python/compile.c, there's a
step that interns all variable names, along with any string literals that
are made up entirely of "name"-like characters:
/******/
PyCodeObject *
PyCode_New(int argcount, int nlocals, int stacksize, int flags,
           PyObject *code, PyObject *consts, PyObject *names,
           PyObject *varnames, PyObject *freevars, PyObject *cellvars,
           PyObject *filename, PyObject *name, int firstlineno,
           PyObject *lnotab)
{
    [some code cut]
    intern_strings(names);
    intern_strings(varnames);
    intern_strings(freevars);
    intern_strings(cellvars);
    /* Intern selected string constants */
    for (i = PyTuple_Size(consts); --i >= 0; ) {
        PyObject *v = PyTuple_GetItem(consts, i);
        if (!PyString_Check(v))
            continue;
        if (!all_name_chars((unsigned char *)PyString_AS_STRING(v)))
            continue;
        PyString_InternInPlace(&PyTuple_GET_ITEM(consts, i));
    }
    [rest of function cut]
/******/
(Code taken from Python 2.3.3 C source)
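To make the filter in that loop a bit more concrete, here is a rough Python sketch of the same test. It is only an illustration, not the real thing: I'm assuming that all_name_chars() accepts exactly the ASCII letters, digits, and underscore, and the helper would_be_interned() is a made-up name that doesn't exist anywhere in the C source.

```python
import string

# Assumption: "name-like" means ASCII letters, digits, and underscore.
NAME_CHARS = frozenset(string.ascii_letters + string.digits + "_")


def all_name_chars(s):
    """Return True if every character of s is a letter, digit, or underscore."""
    return all(ch in NAME_CHARS for ch in s)


def would_be_interned(constant):
    """Hypothetical helper mimicking the loop's filter: skip anything that is
    not a string, then skip strings containing non-name characters."""
    return isinstance(constant, str) and all_name_chars(constant)


# A made-up tuple of constants, like the 'consts' of a code object.
candidates = ("hello", "hello_world2", "hello world", 42)
results = [would_be_interned(c) for c in candidates]
print(results)
```

So under that assumption "hello" passes the filter, while "hello world" does not, because of the space.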
Again, this is a C optimization hack that isn't documented: it's not
documented because we really shouldn't depend on this behavior! *grin*
In fact, I have no idea what Jython does. Let's check it:
###
[dyoo at tesuque dyoo]$ jython
Jython 2.1 on java1.4.1_01 (JIT: null)
Type "copyright", "credits" or "license" for more information.
>>> id("hello world")
17064560
>>> id("hello world")
22629283
>>> id("hello world")
11354272
###
Ah. Yup, Jython does something different. Hence, it's an
implementation detail that we really shouldn't be relying on. But I get the
feeling we've completely strayed off the original topic anyway. *grin*
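For what it's worth, here is a small sketch of the portable way to do things: compare strings with ==, and ask for interning explicitly when identity really matters. One assumption to flag: the sketch uses the Python 3 spelling sys.intern(); in the Python 2.3 discussed here, the same function is the builtin intern().

```python
import sys

# Build the second string at run time, so it is not just the same literal.
a = "hello world"
b = " ".join(["hello", "world"])

# Equality is defined by the language, so this holds on any implementation.
print(a == b)

# Whether 'a is b' holds is an implementation detail, so we don't test it.
# Explicit interning, though, hands back one canonical copy per value.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)
```

The point is that nothing here leans on what the compiler happens to intern behind our backs.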