On Wed, Jul 03, 2002 at 02:17:12PM -0400, Jeremy Hylton wrote:
> "OT" == Oren Tirosh writes:
>
>   OT> This thought experiment is part of a strange fantasy I have that
>   OT> Python might one day use only interned strings to represent
>   OT> names. There are relatively few places where a string may be
>   OT> converted to a name (getattr, hasattr, etc) and these could be
>   OT> interned at the interface if interned strings are not
>   OT> immortal. I expect that nothing will ever come out of this, but
>   OT> it's fun to think about it anyway...
>
> Two responses:
>
> What do you mean by "represent names"? Code objects already use
> interned strings for names. Did you have something else in mind?

Not something else - just more of the same. Interned names in co_names
tuples are a good start, but there are tons of places where literal
C strings are used, such as in descriptors. These names are converted
to temporary Python strings on demand.

My humble goal is for any name that has a predefined meaning in Python
to appear exactly once in the executable, and for that instance to be
a static preinitialized Python string object, not a C string literal.

Here's how it might work: to use the name 'foo' you just refer to the
C name PYSYMfoo. During the build, a helper program scans all C sources
for names starting with PYSYM and automatically generates a .c file
where each of these names appears once as a preinitialized string
object, and an .h file that is included by Python.h. On startup all
these string objects are interned, of course. So any name used from C
is resolved by the linker to point to the single interned instance.
Any name appearing unquoted in Python code is interned when it's
compiled or loaded from the .pyc file.

There are some cases where a string becomes a name, such as the
arguments to functions like getattr and hasattr. These would need to
be interned before reaching the 100% interned core of the language.
I guess this could be done by a new PyArg_ParseTuple format char.
This obviously requires interned strings to be non-immortal, since
otherwise every string ever passed to getattr would live forever.

For example:

    if (strcmp(sname, "__class__") == 0)

becomes

    if (sname == PYSYM__class__)

This is a pretty trivial example, but I have other ideas for
optimizations and cleanups that this would enable. These might lead
to significant improvements in code size and performance.

Well, that's my fantasy. There are still some "minor" problems like
totally breaking the C API.

	Oren
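
P.S. To make the idea above a bit more concrete, here is a minimal,
self-contained sketch in plain C. It is not CPython code: the Sym struct
is a hypothetical stand-in for a string object, the two static PYSYM
objects stand in for what the generated .c file might contain, and
sym_intern() and is_class_attr() are made-up names for the "intern at
the interface" step and the pointer comparison. Because the sketch uses
static objects rather than pointers, the comparison is written against
&PYSYM__class__.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an interned string object. */
typedef struct Sym {
    const char *text;
    struct Sym *next;   /* next entry in the intern table */
} Sym;

/* Stand-ins for the generated file: one static instance per known name. */
static Sym PYSYM__class__ = { "__class__", NULL };
static Sym PYSYM__dict__  = { "__dict__", &PYSYM__class__ };

/* The intern table starts out holding the preinitialized names. */
static Sym *intern_table = &PYSYM__dict__;

/* Intern at the interface: map arbitrary text to its unique Sym.
 * Returns NULL only if memory allocation fails. */
static Sym *sym_intern(const char *text)
{
    for (Sym *s = intern_table; s != NULL; s = s->next) {
        if (strcmp(s->text, text) == 0)
            return s;
    }
    Sym *s = malloc(sizeof *s);
    char *copy = malloc(strlen(text) + 1);
    if (s == NULL || copy == NULL) {
        free(s);
        free(copy);
        return NULL;
    }
    strcpy(copy, text);
    s->text = copy;
    s->next = intern_table;
    intern_table = s;
    return s;
}

/* Inside the "100% interned core" a name test is a pointer comparison. */
static int is_class_attr(const Sym *name)
{
    return name == &PYSYM__class__;
}

/* Example: a getattr-like entry point interns its argument once, then
 * everything below it compares pointers instead of calling strcmp. */
static int lookup_is_class(const char *user_supplied)
{
    Sym *name = sym_intern(user_supplied);
    assert(name != NULL);
    return is_class_attr(name);
}
```

This sketch never frees dynamically interned entries, so its strings
are effectively immortal; the real proposal would need them to be
reclaimable.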