You *can* allocate unicode objects statically. We do it in deepfreeze, and Eric's PR under discussion here (https://github.com/python/cpython/pull/30928) does it too. I wonder whether a better solution than that PR would be to change the implementation of _Py_IDENTIFIER() to do the same, and make the special 'Id' APIs just aliases for the corresponding unicode-object-based APIs. That wouldn't be ABI compatible, but none of those APIs are in the stable ABI.

(Is there a requirement that an Id contain only ASCII characters, i.e., 7-bit?)

On Fri, Feb 4, 2022 at 12:52 PM Steve Dower <steve.dower@python.org> wrote:
On 2/4/2022 5:37 PM, Eric Snow wrote:
> On Thu, Feb 3, 2022 at 3:49 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
>> I suppose I'd like to know what the value of _Py_IDENTIFIER() is for
>> 3rd party modules.
>
> Between Guido, Victor, Stefan, and Sebastian, I'm getting the sense
> that a public replacement for _Py_IDENTIFIER() would be worth pursuing.
> Considering that it would probably help numpy move toward
> subinterpreter support, I may work on this after all. :)
>
> (For core CPython we'll still benefit from the statically initialized
> strings, AKA gh-30928.)

For me, I'd love to see a statically allocated string type (for a real
example that's used in Windows, check out [1], specifically when he gets
to the fast-pass strings).

Essentially, a bare minimum struct around a char* (and/or wchar_t*) that
acts as a PyUnicodeObject but doesn't ever allocate anything on the
heap. This would also be helpful for config strings, which are often
static but need to be copied around a lot, and good for passthrough
strings that a native extension or host app might insert and receive
back unmodified.

Because there's nothing to deallocate, it can be "created" and stored
however the caller likes. As soon as Python code does anything with it
other than passing it around, a regular PyUnicodeObject is allocated
(just like the HSTRING example).

I'd expect usage to look very much like the intern examples earlier in
the thread, but if we actually return a whole struct then we aren't on
the hook for the allocations.
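
To make the idea above a bit more concrete, here is a minimal C sketch. It
is only an illustration under stated assumptions, not a proposed API: every
name in it (static_str, STATIC_STR_INIT, heap_str, pass_through,
static_str_materialize) is hypothetical, and heap_str merely stands in for
a regular PyUnicodeObject so that the sketch compiles without Python.h.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical by-value wrapper around a string literal.
 * It owns nothing, so there is nothing to deallocate. */
typedef struct {
    const char *utf8;    /* points at static storage; never freed */
    size_t      length;  /* byte length, excluding the trailing NUL */
} static_str;

/* Brace initializer for declarations; sizeof on a literal counts the
 * trailing NUL, hence the "- 1". Only valid for string literals. */
#define STATIC_STR_INIT(lit) { (lit), sizeof(lit) - 1 }

/* Stand-in for a heap-allocated PyUnicodeObject (assumption). */
typedef struct {
    size_t length;
    char  *data;
} heap_str;

/* Counts materializations, to show that passing around is free. */
static int heap_allocations = 0;

/* "Passing it around" just copies two words; no heap traffic. */
static static_str pass_through(static_str s)
{
    return s;
}

/* Only when something needs a real object do we allocate one.
 * Returns NULL on allocation failure. */
static heap_str *static_str_materialize(static_str s)
{
    heap_str *h = malloc(sizeof *h);
    if (h == NULL) {
        return NULL;
    }
    h->data = malloc(s.length + 1);
    if (h->data == NULL) {
        free(h);
        return NULL;
    }
    memcpy(h->data, s.utf8, s.length);
    h->data[s.length] = '\0';
    h->length = s.length;
    heap_allocations++;
    return h;
}

/* Release a materialized string; the static_str itself needs no cleanup. */
static void heap_str_free(heap_str *h)
{
    if (h != NULL) {
        free(h->data);
        free(h);
    }
}
```

A caller would declare something like
`static_str name = STATIC_STR_INIT("__name__");`, hand it around by value,
and call static_str_materialize() only at the point where a full object is
actually required. Because the struct is returned whole, the caller decides
where it lives (stack, static storage, inside another struct).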

Cheers,
Steve

[1]: https://devblogs.microsoft.com/oldnewthing/20160615-00/?p=93675


--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?)