On Mon, Dec 28, 2020 at 10:53 PM Antoine Pitrou wrote:
On Mon, 28 Dec 2020 11:07:46 +0900 Inada Naoki wrote:
Additionally, if we introduce the customizable lazy str object, it becomes very easy to release the GIL during basic Unicode operations. Many third parties may assume that PyUnicode_Compare doesn't release the GIL if both operands are Unicode objects.
1) You have to prove such "many third parties" exist. I've written my share of C extension code and I don't remember assuming that PyUnicode_Compare doesn't release the GIL.
It was my mistake to say "many"; I only meant to point out a possible backward incompatibility. Why do I have to prove it?
2) Even if there is such third party code, it is clearly making assumptions about undocumented implementation details. It is therefore ok to break it in new versions of CPython.
But it should be considered carefully, because these APIs have not released the GIL for a long time. And this type of change does not cause just a simple crash, but very rare undefined behavior in complex multithreaded applications. For example, a borrowed reference held by the caller can end up pointing to a different object of the same size, because memory blocks are reused. That is very difficult to notice and reproduce.
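To make the borrowed-reference hazard concrete, here is a rough toy model in plain C. It does not use the real CPython API: `Obj`, `Slot`, `hook_replace_item` and `compare_may_run_hook` are all hypothetical names, and the hook only stands in for whatever code might run once an API starts releasing the GIL or calling back into user code. The sketch only tracks whether the object is still alive; it never touches freed memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a reference-counted object; NOT the real PyObject. */
typedef struct {
    int refcnt;
    int value;
} Obj;

int live_objects = 0;  /* number of Obj instances not yet freed */

static Obj *obj_new(int value) {
    Obj *o = malloc(sizeof *o);
    if (o == NULL) abort();
    o->refcnt = 1;
    o->value = value;
    live_objects++;
    return o;
}

static void obj_incref(Obj *o) { o->refcnt++; }

static void obj_decref(Obj *o) {
    if (--o->refcnt == 0) {
        live_objects--;
        free(o);
    }
}

/* A one-slot container that owns a reference to its item. */
typedef struct { Obj *item; } Slot;

/* Hypothetical hook: models arbitrary code (another thread, or a user
   callback) that replaces the slot's item while an API call is running. */
static void hook_replace_item(Slot *s) {
    Obj *old = s->item;
    s->item = obj_new(old->value + 1);
    obj_decref(old);  /* frees the old item unless someone else holds a ref */
}

/* Models an API that used to be side-effect free but now may run the hook. */
static int compare_may_run_hook(const Obj *a, const Obj *b, Slot *s) {
    int result = (a->value > b->value) - (a->value < b->value);
    hook_replace_item(s);
    return result;
}

/* Returns 1 if the object taken from the slot is still alive after the
   call, 0 if it was freed behind the caller's back. */
int item_survives_call(int hold_strong_ref) {
    Slot s = { obj_new(10) };
    Obj *other = obj_new(10);
    Obj *item = s.item;  /* borrowed reference: the slot owns it */
    if (hold_strong_ref) obj_incref(item);

    int before = live_objects;
    compare_may_run_hook(item, other, &s);
    /* The hook allocated one new object. If the count did not grow,
       the old item was freed and `item` now dangles. */
    int survived = (live_objects == before + 1);

    if (hold_strong_ref) obj_decref(item);
    obj_decref(s.item);
    obj_decref(other);
    return survived;
}
```

In this model `item_survives_call(0)` reports that the borrowed pointer was left dangling, while `item_survives_call(1)` keeps the object alive by taking a strong reference before the call. Code that relied on the call never running anything else would have skipped that extra reference.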
However, I agree that having to call PyUnicode_READY() before calling C unicode APIs is probably an obscure detail that few people remember about.
If we provide a custom callback and call it in PyUnicode_READY(), many
Unicode APIs that use PyUnicode_READY() will change from having
predictable behavior to "may run arbitrary code" behavior. That is an
obscure detail too.
Regards,
--
Inada Naoki