[New-bugs-announce] [issue39465] Design a subinterpreter friendly alternative to _Py_IDENTIFIER

Nick Coghlan report at bugs.python.org
Mon Jan 27 09:28:30 EST 2020


New submission from Nick Coghlan <ncoghlan at gmail.com>:

Both https://github.com/python/cpython/pull/18066 (collections module) and https://github.com/python/cpython/pull/18032 (asyncio module) ran into the same problem: porting them to multi-phase initialisation involves replacing their usage of the `_Py_IDENTIFIER` macro with some other mechanism.

When _posixsubprocess was ported, the replacement was a relatively ad hoc combination of string interning and the interpreter-managed module-specific state: https://github.com/python/cpython/commit/5a7d2e11aaea2dd32878dc5c6b1aae8caf56cb44

I'm wondering if we may be able to devise a comparable struct-field based system that replaces the `_Py_IDENTIFIER` local static variable declaration macro and the `PyId_<name>` lookup convention with a combination like (using the posix subprocess module conversion as an example):

    // Identifier usage declaration (replaces _Py_IDENTIFIER)
    _Py_USE_CACHED_IDENTIFIER(_posixsubprocessstate(m), disable);

    // Identifier usage remains unchanged, but uses a regular local variable
    // rather than the static variable declared by _Py_IDENTIFIER
    result = _PyObject_CallMethodIdNoArgs(gc_module, &PyId_disable);

And then the following additional state management macros would be needed to handle the string interning and reference counting:

    // Module state struct declaration
    typedef struct {
        // This would declare an initialised array of _Py_Identifier structs
        // under a name like __cached_identifiers__. The end of the array
        // would be indicated by a struct with "value" set to NULL.
        _Py_START_CACHED_IDENTIFIERS;
        _Py_CACHED_IDENTIFIER(disable);
        _Py_CACHED_IDENTIFIER(enable);
        _Py_CACHED_IDENTIFIER(isenabled);
        _Py_END_CACHED_IDENTIFIERS;
    } _posixsubprocessstate;

    // Module m_traverse implementation
    _Py_VISIT_CACHED_IDENTIFIERS(_posixsubprocessstate(m));

    // Module m_clear implementation (also called by m_free)
    _Py_CLEAR_CACHED_IDENTIFIERS(_posixsubprocessstate(m));
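
To make the sentinel-terminated layout more concrete, here is a minimal standalone sketch of how a clear helper could walk such an array. It is not CPython code: `cached_identifier`, `module_state`, `state_init` and `state_clear` are hypothetical names, and a heap-allocated string stands in for the interned `PyObject *` that a real implementation would hold and release with `Py_CLEAR`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for _Py_Identifier: the identifier's C name plus
   a cached value that stays NULL until the identifier is first used. */
typedef struct {
    const char *name;   /* NULL marks the end of the array */
    char *cached;       /* would be an interned PyObject * in CPython */
} cached_identifier;

/* Hypothetical module state: three identifiers followed by a sentinel. */
typedef struct {
    cached_identifier ids[4];
} module_state;

/* Fill in the names and leave every cache slot empty. */
static void state_init(module_state *st)
{
    static const char *const names[] = {"disable", "enable", "isenabled"};
    for (size_t i = 0; i < 3; i++) {
        st->ids[i].name = names[i];
        st->ids[i].cached = NULL;
    }
    st->ids[3].name = NULL;     /* sentinel entry */
    st->ids[3].cached = NULL;
}

/* Walk the array up to the sentinel and release every cached value.
   Returns how many entries were released, so a second call is a no-op. */
static int state_clear(module_state *st)
{
    int released = 0;
    for (cached_identifier *id = st->ids; id->name != NULL; id++) {
        if (id->cached != NULL) {
            free(id->cached);
            id->cached = NULL;
            released++;
        }
    }
    return released;
}
```

A traverse helper would have the same loop shape, visiting each non-NULL cached object instead of releasing it.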

With the requirement to declare usage of the cached identifiers, they could be lazily initialized the same way the existing static variables are (even re-using the same struct declaration).
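
As a rough illustration of that lazy initialisation, the standalone sketch below fills a cache slot on first use and keeps it in per-module state rather than in a C static, so two state structs (standing in for the same module loaded in two interpreters) never share a cached value. The names `cached_identifier`, `module_state` and `use_cached_identifier` are assumptions made for this example, and a copied C string stands in for the interned Python string object.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for _Py_Identifier. */
typedef struct {
    const char *name;   /* identifier text, fixed at declaration time */
    char *cached;       /* NULL until the identifier is first used */
} cached_identifier;

/* Hypothetical per-module state holding one cached identifier. */
typedef struct {
    cached_identifier disable;
    int creations;      /* counts how often a cache slot was filled */
} module_state;

/* Return the cached value for `id`, creating it on first use.
   Returns NULL only if the allocation fails. */
static const char *use_cached_identifier(module_state *st,
                                         cached_identifier *id)
{
    assert(id->name != NULL);
    if (id->cached == NULL) {
        size_t len = strlen(id->name) + 1;
        char *copy = malloc(len);
        if (copy == NULL) {
            return NULL;
        }
        memcpy(copy, id->name, len);
        id->cached = copy;
        st->creations++;
    }
    return id->cached;
}
```

Repeated lookups through the same state return the same pointer, while a second state gets its own independent copy, which is the property the static-variable approach cannot provide.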

Note: this is just a draft of one possible design. The intent of this issue is to highlight that this problem has now come up multiple times, and it would be good to have a standard answer available.

----------
messages: 360766
nosy: eric.snow, ncoghlan, petr.viktorin, shihai1991
priority: normal
severity: normal
status: open
title: Design a subinterpreter friendly alternative to _Py_IDENTIFIER

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue39465>
_______________________________________