
On Tue, Feb 2, 2021 at 12:43 AM M.-A. Lemburg <mal@egenix.com> wrote:
Hi Inada-san,
thank you for adding some comments, but they are not really capturing what I think is missing:
""" Removing these APIs removes ability to use codec without temporary Unicode.
Codecs can not encode Unicode buffer directly without temporary Unicode object since Python 3.3. All these APIs creates temporary Unicode object for now. So removing them doesn't reduce any abilities. """
The point is that while the decoders allow going from a C object to a Python object directly, we are missing a way to do the same for the encoders, since the Python 3.3 change in the Unicode internals.
At the very least, we should have such APIs for going from wchar_t* to a Python object.
We already have PyUnicode_FromWideChar(). So I assume you mean "wchar_t* to Python bytes object".
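For reference, a minimal sketch of the two-step path that exists today, driven through ctypes so it runs without compiling an extension. The helper name wchar_to_utf8 is made up for illustration, and the choice of UTF-8 is an assumption. Only PyUnicode_FromWideChar() and PyUnicode_AsUTF8String() are real C API calls.

```python
import ctypes

# Borrow the C API of the running CPython interpreter.
api = ctypes.pythonapi

api.PyUnicode_FromWideChar.argtypes = [ctypes.c_wchar_p, ctypes.c_ssize_t]
api.PyUnicode_FromWideChar.restype = ctypes.py_object

api.PyUnicode_AsUTF8String.argtypes = [ctypes.py_object]
api.PyUnicode_AsUTF8String.restype = ctypes.py_object


def wchar_to_utf8(buf, size):
    """Hypothetical helper: wchar_t buffer -> UTF-8 bytes via a temporary str."""
    # Step 1: wchar_t* -> temporary Python str object.
    tmp = api.PyUnicode_FromWideChar(buf, size)
    # Step 2: temporary str -> bytes object.
    return api.PyUnicode_AsUTF8String(tmp)


# A wchar_t array standing in for a buffer owned by C code.
wbuf = ctypes.create_unicode_buffer("h\u00e9llo")
print(wchar_to_utf8(wbuf, 5))
```

The temporary str created in step 1 is the intermediate object under discussion in this thread.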
The alternatives you provide all require creating an intermediate Python object for this purpose. The APIs you want to remove do that as well, but that's not the point. The point is to expose the codecs' encode mechanism which is available in the C code, but currently not exposed via C APIs, e.g. ucs4lib_utf8_encode().
It would be a breaking change, but those APIs in your list could simply be changed from using Py_UNICODE to using wchar_t instead and then interface directly to the internal functions we have for the encoders.
OK, I see codecs.h has three encoders.

* utf8_encode
* utf16_encode
* utf32_encode

But there are 13 encoders in my PEP:

PyUnicode_Encode()
PyUnicode_EncodeASCII()
PyUnicode_EncodeLatin1()
PyUnicode_EncodeUTF7()
PyUnicode_EncodeUTF8()
PyUnicode_EncodeUTF16()
PyUnicode_EncodeUTF32()
PyUnicode_EncodeUnicodeEscape()
PyUnicode_EncodeRawUnicodeEscape()
PyUnicode_EncodeCharmap()
PyUnicode_TranslateCharmap()
PyUnicode_EncodeDecimal()
PyUnicode_TransformDecimalToASCII()

Do you want to keep all encoders? Or 3 encoders?
That would keep extensions working after a recompile, since Py_UNICODE is already a typedef to wchar_t.
That idea is written in the PEP already.
https://www.python.org/dev/peps/pep-0624/#replace-py-unicode-with-wchar-t

Regards,
--
Inada Naoki <songofacandy@gmail.com>