
12.06.20 11:32, Inada Naoki wrote:
Hi, all.
Py_UNICODE has been deprecated since PEP 393 (Flexible string representation).
The wchar_t* cache in the string object is used only by deprecated APIs. It wastes 1 word (8 bytes on a 64-bit machine) per string instance.
The deprecated APIs are documented as "Deprecated since version 3.3, will be removed in version 4.0." See https://docs.python.org/3/c-api/unicode.html#deprecated-py-unicode-apis
But when PEP 393 was implemented, no one expected that 3.10 would be released. Can we reschedule the removal?
My proposal is to schedule the removal for Python 3.11. But we will postpone the removal if we cannot remove its usage by then.
I have a plan for a more gradual removal of this feature. I created a PR which adds several compile options, so Python can be built in one of three modes:

1. Support the wchar_t* cache and use it. This is the current mode.
2. Support the wchar_t* cache, but do not use it internally in CPython. This mode can be used to test whether getting rid of the wchar_t* cache has negative effects.
3. Do not support the wchar_t* cache. This is a binary-incompatible build. Its purpose is to let authors of third-party libraries prepare for future breakage.

The plan is:

1. Add support for the above compile options. Unfortunately I did not have time to do this before the feature freeze in 3.9, but maybe we can make an exception?
2. Make option 2 the default.
3. Remove option 1.
4. Enable compiler deprecation warnings for all of the legacy C API. Currently they are silenced for the C API used internally.
5. Make the legacy C API always fail.
6. Remove the legacy C API from the header files.

There is a long way to go before steps 5 and 6. I think 3.11 is too early.

https://bugs.python.org/issue36346
https://github.com/python/cpython/pull/12409
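
To give a rough sense of scale for the one-word-per-string cost mentioned in the quoted message, here is a small sketch in Python. It assumes exactly one pointer-sized slot per str object, as stated there; the helper name wchar_cache_overhead and the figure of one million strings are made up for illustration and do not come from the PR or from any measurement.

```python
import struct
import sys

# Size of one pointer ("word") on this build:
# 8 on a typical 64-bit machine, 4 on a 32-bit one.
POINTER_SIZE = struct.calcsize("P")


def wchar_cache_overhead(n_strings: int) -> int:
    """Bytes taken by the wchar_t* cache slot across n_strings str objects.

    Assumption: one pointer-sized slot per string, as described in the
    quoted message. This is plain arithmetic, not a measurement.
    """
    return n_strings * POINTER_SIZE


if __name__ == "__main__":
    n = 1_000_000  # hypothetical number of live strings
    print(f"pointer size: {POINTER_SIZE} bytes")
    print(f"slot overhead for {n:,} strings: {wchar_cache_overhead(n):,} bytes")
    # Total size of an empty str on this interpreter, for comparison.
    # The value depends on the Python version and platform.
    print(f"sys.getsizeof(''): {sys.getsizeof('')} bytes")
```

The last line prints the size of an empty string on whatever interpreter runs the script, so running it on builds with and without the cache is one way to see whether the slot is present.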