2 Jul 2020, 5:53 a.m.
On 30 Jun 2020, at 13:43, Emily Bowman <silverbacknet@gmail.com> wrote:
I completely agree with this, that UTF-8 has become the One True Encoding(tm), and UCS-2 and UTF-16 are hardly found anywhere outside of the Win32 API. Nearly all basic emoji can't be represented in UCS-2 wchar_t, let alone composite emoji.
I use UTF-32 (UCS-4) in my extensions, but I never persist it; for storage I use UTF-8. If you are calling the Win32 "Unicode" APIs then you need UTF-16. My plan with PyCXX is to replace Py_UNICODE with UCS-4. I think all the UCS-4 APIs will still be present. Once I add that support to PyCXX, all my users should be able to port easily to a non-Py_UNICODE world.

Barry
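
To make the size trade-offs above concrete, here is a minimal Python sketch that counts the storage units a string needs in each encoding form. The helper names `code_units` and `fits_in_ucs2` are hypothetical, made up for illustration; they are not part of PyCXX or the CPython API.

```python
def code_units(text: str) -> dict:
    """Count the storage units `text` needs in each encoding form."""
    return {
        "utf-8 bytes": len(text.encode("utf-8")),
        # The "-le" codecs emit no byte order mark, so the division is exact.
        "utf-16 units": len(text.encode("utf-16-le")) // 2,
        "utf-32 units": len(text.encode("utf-32-le")) // 4,
    }


def fits_in_ucs2(text: str) -> bool:
    """True if every code point fits in a single 16-bit unit (the BMP)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


grinning = "\U0001F600"  # a basic emoji, outside the BMP
# A composite emoji: man, ZWJ, woman, ZWJ, girl
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"

for label, sample in (("grinning", grinning), ("family", family)):
    print(label, fits_in_ucs2(sample), code_units(sample))
```

The basic emoji takes one UTF-32 unit but a surrogate pair in UTF-16, so it cannot be held in a single 16-bit wchar_t. The composite sequence is five code points, so even a fixed-width UTF-32 buffer does not give one unit per visible character.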