Hi, all.
Py_UNICODE has been deprecated since PEP 393 (Flexible string representation).
The wchar_t* cache in the string object is used only by the deprecated APIs.
It wastes one word (8 bytes on a 64-bit machine) per string instance.
The deprecated APIs are documented as "Deprecated since version 3.3,
will be removed in version 4.0."
See https://docs.python.org/3/c-api/unicode.html#deprecated-py-unicode-apis
But when PEP 393 was implemented, no one expected that 3.10 would be released.
Can we reschedule the removal?
My proposal is to schedule the removal for Python 3.11. We would postpone
the removal if the remaining usage cannot be eliminated by then.
I grepped for uses of the deprecated APIs in the top 4000 PyPI packages.
result: https://github.com/methane/notes/blob/master/2020/wchar-cache/deprecated-use
step: https://github.com/methane/notes/blob/master/2020/wchar-cache/README.md
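For anyone who wants to run a similar check on their own code, here is a rough sketch in Python. It is not the script behind the result above (the README has the actual steps). The API list is only a subset of the names on the deprecated-APIs docs page, and `count_deprecated_uses` is a hypothetical helper name.

```python
import re
from collections import Counter

# A subset of the deprecated Py_UNICODE APIs (illustrative, not exhaustive).
DEPRECATED_APIS = (
    "PyUnicode_FromUnicode",
    "PyUnicode_AsUnicode",
    "PyUnicode_AsUnicodeAndSize",
    "PyUnicode_GetSize",
    "PyUnicode_AS_UNICODE",
    "PyUnicode_GET_SIZE",
)

# \b on both sides so that PyUnicode_AsUnicode does not match inside
# longer names such as PyUnicode_AsUnicodeEscapeString.
_PATTERN = re.compile(r"\b(" + "|".join(DEPRECATED_APIS) + r")\b")


def count_deprecated_uses(source: str) -> Counter:
    """Count occurrences of each deprecated API name in C source text."""
    return Counter(_PATTERN.findall(source))
```

Like a plain grep, this also counts names that appear in comments or string literals, so the numbers are an upper bound.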
I noticed:
* Most of them are generated by Cython.
  * I reported this to Cython, so Cython 0.29.21 will fix them. I expect
    there will be more than a year between Cython 0.29.21 and Python 3.11rc1.
* Most of them are calls of the form `PyUnicode_FromUnicode(NULL, 0);`
  * We may be able to keep PyUnicode_FromUnicode, but raise an error when
    length > 0.
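
To estimate how many call sites would keep working if PyUnicode_FromUnicode only accepted a zero length, one could separate the `PyUnicode_FromUnicode(NULL, 0)` calls from the rest. A rough sketch in Python; `classify_calls` is a hypothetical helper, and the regex assumes simple arguments, so treat the output as an estimate only.

```python
import re

# Captures the two arguments of a PyUnicode_FromUnicode call.
# Assumption: neither argument contains a comma or parenthesis; calls with
# nested expressions such as casts are not matched and so are not counted.
_CALL = re.compile(
    r"\bPyUnicode_FromUnicode\s*\(\s*([^,()]*?)\s*,\s*([^,()]*?)\s*\)"
)


def classify_calls(source: str) -> dict:
    """Split PyUnicode_FromUnicode calls into (NULL, 0) calls and others."""
    result = {"empty": 0, "other": 0}
    for buf, size in _CALL.findall(source):
        if buf == "NULL" and size == "0":
            result["empty"] += 1
        else:
            result["other"] += 1
    return result
```

Calls counted under "other" are the ones that would need a closer look before the removal.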
Regards,
--
Inada Naoki