On 30 Jun 2020, at 13:43, Emily Bowman <silverbacknet@gmail.com> wrote:

I completely agree with this, that UTF-8 has become the One True Encoding(tm), and UCS-2 and UTF-16 are hardly found anywhere outside of the Win32 API. Nearly all basic emoji can't be represented in UCS-2 wchar_t, let alone composite emoji.


I use UCS-4 (UTF-32) in my extensions, but I never persist it; for anything stored I use UTF-8.
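
To make that split concrete, here is a minimal sketch in Python rather than C++. The helper names (to_code_points, persist, load) are made up for illustration and are not PyCXX APIs. Text is viewed as 32-bit code points in memory and only turned into UTF-8 at the storage boundary:

```python
def to_code_points(text: str) -> list[int]:
    """Return the text as a list of 32-bit code points (the in-memory view)."""
    return [ord(ch) for ch in text]


def persist(text: str) -> bytes:
    """Encode text as UTF-8 for storage (hypothetical helper)."""
    return text.encode("utf-8")


def load(data: bytes) -> str:
    """Decode UTF-8 bytes read back from storage (hypothetical helper)."""
    return data.decode("utf-8")


sample = "snake \U0001F40D"          # ends with an emoji outside the BMP
code_points = to_code_points(sample)  # one integer per character
stored = persist(sample)              # ASCII takes 1 byte each, the emoji 4
```

The emoji is a single code point in memory, so indexing stays simple, while the stored form stays compact for mostly-ASCII text.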

If you are calling the Win32 "Unicode" (wide-character) APIs then you need UTF-16.
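
As a rough illustration of what those APIs expect, the sketch below (Python, with a made-up helper name to_utf16_units) converts text to 16-bit code units (UTF-16). A character outside the BMP, such as an emoji, becomes a surrogate pair, which is why a single 16-bit unit per character is not enough:

```python
import struct


def to_utf16_units(text: str) -> list[int]:
    """Return the 16-bit code units of text, little-endian, without a BOM."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


ascii_units = to_utf16_units("A")            # one unit
emoji_units = to_utf16_units("\U0001F40D")   # two units: a surrogate pair
```

A real call into a wide-character function would also need a terminating NUL unit; that is left out here.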

My plan with PyCXX is to replace Py_UNICODE with UCS-4.
I think all the UCS-4 APIs will still be present.

Once I add that support to PyCXX, all my users should be able to port easily to a non-Py_UNICODE world.

Barry