
Hi INADA-san,

IMO Python 3.11 is too early because we don't emit a DeprecationWarning for every single deprecated function.

1) Emit a DeprecationWarning at runtime (ex: Python 3.10)
2) Wait two Python releases: see https://discuss.python.org/t/pep-387-backwards-compatibilty-policy/4421
3) Remove the deprecated feature (ex: Python 3.12)

I don't understand whether *all* deprecated functions are causing implementation issues, or only a few of them.

PyUnicode_AS_UNICODE() initializes PyASCIIObject.wstr if needed, and then returns PyASCIIObject.wstr. I don't think that PyASCIIObject.wstr can be called "a cache": there are functions relying on this member.

On the other hand, PyUnicode_FromUnicode(str, size) is basically a wrapper around PyUnicode_FromWideChar(): it doesn't hurt to keep this wrapper to ease migration. Only PyUnicode_FromUnicode(NULL, size) is causing trouble, right?

Is there a list of deprecated functions, and is it possible to group them into two categories: "must be removed" and "can be kept for a few more releases"?

If the intent is to reduce the Python memory footprint, PyASCIIObject.wstr can be moved out of the PyASCIIObject structure; maybe we can imagine a WeakDict. It would map a Python str object to its wstr member (wchar_t* string). If the Python str object is removed, we can release the wstr string. The technical problem is that it is not possible to create a weak reference to a Python str. We may insert code in unicode_dealloc() to manually delete the wstr in this case. Maybe a _Py_hashtable_t from pycore_hashtable.h could be used for that.

Since this discussion has been ongoing for something like 5 years in multiple bugs.python.org issues and email threads, maybe it would help to have a short PEP describing the issues with the deprecated functions, explaining the plan to migrate to the new functions, and giving a schedule of the incompatible changes.

INADA-san: would you be a candidate to write such a PEP?

Victor

On Fri, 12 June 2020 at 10:37, Inada Naoki <songofacandy@gmail.com> wrote:
Hi, all.
Py_UNICODE has been deprecated since PEP 393 (Flexible string representation).
The wchar_t* cache in the string object is used only by deprecated APIs. It wastes 1 word (8 bytes on a 64-bit machine) per string instance.
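To give a feel for the scale of that one-word cost, here is a minimal arithmetic sketch. The helper name `wstr_overhead_bytes` and the string counts are made up for illustration; only the "one pointer-sized word per string" figure comes from the message.

```python
import struct

# Pointer size of the running interpreter: 8 on a 64-bit build, 4 on 32-bit.
WORD_SIZE = struct.calcsize("P")


def wstr_overhead_bytes(n_strings, word_size=WORD_SIZE):
    """Bytes spent on the wchar_t* slot for n_strings str objects.

    Assumes exactly one pointer-sized word per string, as stated above.
    """
    return n_strings * word_size


# Hypothetical process holding one million str objects on a 64-bit machine.
print(wstr_overhead_bytes(1_000_000, word_size=8))  # 8000000 bytes, about 7.6 MiB
```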
The deprecated APIs are documented as "Deprecated since version 3.3, will be removed in version 4.0." See https://docs.python.org/3/c-api/unicode.html#deprecated-py-unicode-apis
But when PEP 393 was implemented, no one expected that 3.10 would be released. Can we reschedule the removal?
My proposal is to schedule the removal for Python 3.11, but to postpone it if usage of these APIs cannot be eliminated by then.
I grepped for uses of the deprecated APIs in the top 4000 PyPI packages.
result: https://github.com/methane/notes/blob/master/2020/wchar-cache/deprecated-use
step: https://github.com/methane/notes/blob/master/2020/wchar-cache/README.md
I noticed:
* Most of them are generated by Cython.
* I reported it to Cython so Cython 0.29.21 will fix them. I expect more than 1 year between Cython 0.29.21 and Python 3.11rc1.
* Most of them are `PyUnicode_FromUnicode(NULL, 0);`
* We may be able to keep PyUnicode_FromUnicode, but raise error when length>0.
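The kind of survey described above could be approximated with a small scanner like the one below. This is only a sketch: the API list is an illustrative subset rather than the list actually used for the survey, the sample C lines are invented, and `scan_source` is a hypothetical helper name.

```python
import re
from collections import Counter

# Illustrative subset of the deprecated Py_UNICODE APIs (an assumption,
# not the exact list used for the PyPI survey).
DEPRECATED_APIS = (
    "PyUnicode_AS_UNICODE",
    "PyUnicode_AsUnicode",
    "PyUnicode_AsUnicodeAndSize",
    "PyUnicode_FromUnicode",
)

# Match a call to any listed API: the name followed by an opening parenthesis.
_CALL_RE = re.compile(r"\b(" + "|".join(DEPRECATED_APIS) + r")\s*\(")
# The special case singled out in the message: an empty string created
# with PyUnicode_FromUnicode(NULL, 0).
_EMPTY_RE = re.compile(r"\bPyUnicode_FromUnicode\s*\(\s*NULL\s*,\s*0\s*\)")


def scan_source(text):
    """Return (per-API call counts, number of FromUnicode(NULL, 0) calls)."""
    counts = Counter(m.group(1) for m in _CALL_RE.finditer(text))
    empty_calls = len(_EMPTY_RE.findall(text))
    return counts, empty_calls


# Invented C fragment, for demonstration only.
SAMPLE = """
empty = PyUnicode_FromUnicode(NULL, 0);
Py_UNICODE *buf = PyUnicode_AS_UNICODE(obj);
PyObject *s = PyUnicode_FromUnicode(data, len);
"""

counts, empty_calls = scan_source(SAMPLE)
print(dict(counts), empty_calls)
```

Counting the `(NULL, 0)` form separately makes it easy to see how much of the usage would survive if PyUnicode_FromUnicode were kept but restricted to zero length.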
Regards,
-- Inada Naoki <songofacandy@gmail.com>
-- Night gathers, and now my watch begins. It shall not end until my death.
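
As a rough illustration of the side-table idea in this message (keeping wstr outside the str object and releasing it from a dealloc hook), here is a minimal Python sketch. It is not how CPython implements anything: `WstrSideTable`, `supports_weakref`, and `on_dealloc` are hypothetical names, the buffer assumes a 4-byte wchar_t, and `on_dealloc` has to be called by hand because Python code cannot hook the deallocation of a str. In C, the equivalent call would sit in unicode_dealloc().

```python
import weakref


def supports_weakref(obj) -> bool:
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


class WstrSideTable:
    """Hypothetical side table: str identity -> wide-char buffer."""

    def __init__(self):
        self._table = {}

    def get_wstr(self, s: str) -> bytes:
        # Build the buffer lazily on first request, then reuse it.
        key = id(s)
        buf = self._table.get(key)
        if buf is None:
            # Assumption: 4-byte wchar_t, NUL-terminated.
            buf = s.encode("utf-32-le") + b"\x00" * 4
            self._table[key] = buf
        return buf

    def on_dealloc(self, s: str) -> None:
        # Stand-in for the cleanup that would run in unicode_dealloc().
        self._table.pop(id(s), None)

    def __len__(self) -> int:
        return len(self._table)


class Plain:
    """Ordinary class; its instances can be weakly referenced."""


print(supports_weakref("abc"))    # False: str cannot be weakly referenced
print(supports_weakref(Plain()))  # True

table = WstrSideTable()
text = "abc"
first = table.get_wstr(text)
assert table.get_wstr(text) is first  # second call reuses the stored buffer
table.on_dealloc(text)                # manual stand-in for the dealloc hook
print(len(table))                     # 0
```

The table is keyed by `id()` precisely because a weak-keyed mapping is not available for str. That only stays safe if the entry is removed before the object's address can be reused, which is why the cleanup would have to live in the deallocation path.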