I'm planning on moving us to a simpler, more efficient alternative to _Py_IDENTIFIER(), but want to see if there are any objections before moving ahead. Also see https://bugs.python.org/issue46541.

_Py_IDENTIFIER() was added in 2011 to replace several internal string object caches and to support cleaning up the cached objects during finalization. A number of "private" functions (each with a _Py_Identifier param) were added at that time, mostly corresponding to existing functions that take PyObject* or char*. Note that at present there are several hundred uses of _Py_IDENTIFIER(), including a number of duplicates.

My plan is to replace our use of _Py_IDENTIFIER() with statically initialized string objects (as fields under _PyRuntimeState). That involves the following:

* add a PyUnicodeObject field (not a pointer) to _PyRuntimeState for each string that currently uses _Py_IDENTIFIER() (or _Py_static_string())
* statically initialize each object as part of the initializer for _PyRuntimeState
* add a macro to look up a given global string
* update each location that currently uses _Py_IDENTIFIER() to use the new macro instead

Pros:

* reduces indirection (and extra calls) for C-API functions that need the strings (making the code a little easier to understand and speeding it up)
* the objects are referenced from a fixed address in the static data section instead of the heap (speeding things up and allowing the C compiler to optimize better)
* there is no lazy allocation (or lookup, etc.), so there are fewer possible failures when the objects get used (thus less error return checking)
* saves memory (a little, at least)
* if per-interpreter strings are ever needed, the approach for that is simpler
* helps us get rid of several hundred static variables throughout the code base
* allows us to get rid of _Py_IDENTIFIER() and a bunch of related C-API functions
* "deep frozen" modules can use the global strings
* commonly-used strings could be pre-allocated by adding _PyRuntimeState fields for them

Cons:

* a little less convenient: adding a global string requires modifying a separate file from the one where you actually want to use the string
* strings can get "orphaned" (I'm planning on checking for that in CI)
* some strings may never get used for any given ./python invocation (not that big a difference though)

I have a PR up (https://github.com/python/cpython/pull/30928) that adds the global strings and replaces use of _Py_IDENTIFIER() in our code base, except in non-builtin stdlib extension modules. (Those will be handled separately if we proceed.) The PR also adds a CI check for "orphaned" strings. It leaves _Py_IDENTIFIER() in place for now, but disallows any Py_BUILD_CORE code from using it.

With that change I'm seeing a 1% improvement in performance (see https://github.com/faster-cpython/ideas/issues/230).

I'd also like to actually get rid of _Py_IDENTIFIER(), along with the other related API, including ~14 (private) C-API functions. Dropping all that helps reduce maintenance costs. However, at least one PyPI project (blender) is using _Py_IDENTIFIER(). So, before we could get rid of it, we'd first have to deal with that project (and any others).

To sum up, I wanted to see if there are any objections before I start merging anything. Thanks!

-eric
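
For illustration, the steps in the plan (a non-pointer field per string, static initialization, and a lookup macro) can be sketched roughly as below. This is a simplified model and not the code from the PR: static_str, runtime_state, STATIC_STR_INIT, GLOBAL_STR, and static_str_equals are hypothetical names, and the struct is a stand-in that does not match the real PyUnicodeObject layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a statically allocated string object.
   The real PyUnicodeObject has a different, more involved layout. */
typedef struct {
    size_t length;      /* number of bytes, excluding the trailing NUL */
    const char *data;   /* points at a string literal in static storage */
} static_str;

/* Builds a static initializer from a string literal. */
#define STATIC_STR_INIT(LITERAL) { sizeof(LITERAL) - 1, LITERAL }

/* One embedded field (not a pointer) per global string, mirroring the
   idea of adding fields under _PyRuntimeState. Field names are examples. */
typedef struct {
    struct {
        static_str close;
        static_str join;
        static_str name;
    } strings;
} runtime_state;

/* Statically initialized, so the objects live in the data section and
   nothing is allocated or looked up lazily at first use. */
static runtime_state runtime = {
    .strings = {
        .close = STATIC_STR_INIT("close"),
        .join  = STATIC_STR_INIT("join"),
        .name  = STATIC_STR_INIT("__name__"),
    },
};

/* Lookup macro: expands to the fixed address of the embedded object. */
#define GLOBAL_STR(NAME) (&runtime.strings.NAME)

/* Example consumer: compares a global string with a C string. */
static int static_str_equals(const static_str *s, const char *text)
{
    assert(s != NULL && text != NULL);
    return strlen(text) == s->length
        && memcmp(s->data, text, s->length) == 0;
}
```

A call site that previously declared its own identifier would instead write something like GLOBAL_STR(join). In this sketch the expression is an address known at link time, so the caller has no failure case to check for.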