
On Thu, Jul 9, 2020 at 10:13 PM Jim J. Jewett <jimjjewett@gmail.com> wrote:
Unless I'm missing something, part of M.-A. Lemburg's objection is:
1. The wchar_t type is itself an important interoperability story in C. (I'm not sure if this includes the ability, at compile time, to define wchar_t as either of two widths.)
Of course. But wchar_t* is not the only way to use Unicode in C. These days UTF-8 is the most common way to use Unicode in C (outside Java, .NET, and the Windows API), so the importance of the wchar_t* APIs is relative, not absolute. In other words, why don't we have an encode API that takes UTF-8 input directly? Is there any evidence that wchar_t* is much more important than UTF-8?
2. The ability to work directly with wchar_t without a round-trip in/out of python format is an important feature that CPython has provided for C integrators.
Note that the current API *does* perform the round-trip. For example: https://github.com/python/cpython/blob/61bb24a270d15106decb1c7983bf4c2831671... Users cannot use the API without initializing the Python VM, and they cannot avoid the time and space cost of the round-trip. So removing these APIs does not take away any capability.
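To make the round-trip concrete, here is a rough sketch of the two steps as I understand them: the wchar_t buffer is first converted into a temporary str object, and only then is that object encoded. It uses ctypes so it can be run from Python; the helper name encode_utf8_from_wchar is hypothetical, and the sketch illustrates the pattern rather than reproducing the linked source.

```python
import ctypes

# Declare the two public C API functions used in this sketch.
api = ctypes.pythonapi
api.PyUnicode_FromWideChar.argtypes = [ctypes.c_wchar_p, ctypes.c_ssize_t]
api.PyUnicode_FromWideChar.restype = ctypes.py_object
api.PyUnicode_AsUTF8String.argtypes = [ctypes.py_object]
api.PyUnicode_AsUTF8String.restype = ctypes.py_object


def encode_utf8_from_wchar(text):
    """Hypothetical helper: encode a wchar_t buffer via a temporary str."""
    # A NUL-terminated wchar_t[] buffer, as a C caller would hold it.
    buf = ctypes.create_unicode_buffer(text)

    # Step 1: wchar_t* -> temporary Python str object.
    # A size of -1 asks the function to compute the length with wcslen().
    tmp = api.PyUnicode_FromWideChar(buf, -1)

    # Step 2: temporary str object -> UTF-8 bytes object.
    return api.PyUnicode_AsUTF8String(tmp)


if __name__ == "__main__":
    print(encode_utf8_from_wchar("h\u00e9llo"))
```

Both steps need a running interpreter and both allocate, which is the cost a caller holding a wchar_t* cannot avoid with this pattern.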
3. The above support can be kept even without the wchar_t* member ... so saving the extra space on each string instance does not require dropping this support.
This is why I split PEP 623 and PEP 624. I never said that removing the wchar_t* member is the motivation for PEP 624.

Regards,
--
Inada Naoki <songofacandy@gmail.com>