
On Mon, Nov 2, 2009 at 9:45 AM, Daniel Stutzbach <daniel@stutzbachenterprises.com> wrote:
> On Mon, Nov 2, 2009 at 11:34 AM, Guido van Rossum <guido@python.org> wrote:
>> This sounds attractive, but I kind of doubt that changing a single API is sufficient. Perhaps it would be useful to do a kind of review or survey of how many Unicode APIs are used by the typical extension?
>
> I made an editing error. I meant to suggest altering all the PyUnicode_* macro/functions, except those that explicitly use Py_UNICODE or PyUnicodeObject in their signature. PyUnicode_FromString was just an example.

We'd also have to hide the macros that can be used to access the internals of a PyUnicodeObject, in order for that approach to be safe. Basically, an extension would have to include a second header file to use those macros and it would have to somehow indicate to the linker that it is using UCS2 or UCS4 internals as well.

I would want to err on the safe side here -- if it was at all easy to create an extension that *seems* to be ABI-neutral but *actually* relies on knowledge about the UCS2 or UCS4 representation, we'd be creating a worse problem. Users don't like stuff not working, but they *really* don't like stuff crashing with random core dumps -- if it has to be broken, let it break very loudly and explicitly. The current approach satisfies that requirement -- it probably just errs too far on the "never assume it might work" side.

--
--Guido van Rossum (python.org/~guido)
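
To make the "second header plus linker indication" idea concrete, here is a rough single-file sketch in C. It is only an illustration under stated assumptions: every demo_* / DEMO_* name is hypothetical and none of them is part of the real Python C API. The internals are visible only after an explicit opt-in macro, and the opt-in section references a symbol whose name encodes the code-unit width, so an extension built for the other width would fail at link time rather than crash at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical build-time setting standing in for the interpreter's
 * configured code-unit width (2 = UCS2, 4 = UCS4). */
#ifndef DEMO_UNICODE_SIZE
#define DEMO_UNICODE_SIZE 4
#endif

#if DEMO_UNICODE_SIZE == 2
typedef uint16_t demo_unicode;
#define DEMO_WIDTH_SYMBOL demo_abi_ucs2
#else
typedef uint32_t demo_unicode;
#define DEMO_WIDTH_SYMBOL demo_abi_ucs4
#endif

/* An extension must opt in before it can see the internals.
 * In a real layout this define would live in the extension's source and
 * the guarded section below would be the separate, second header. */
#define DEMO_USE_UNICODE_INTERNALS

#ifdef DEMO_USE_UNICODE_INTERNALS
/* Width-tagged symbol: the name differs between UCS2 and UCS4 builds. */
extern const int DEMO_WIDTH_SYMBOL;

/* Hypothetical stand-in for the object layout that the macros expose. */
typedef struct {
    size_t length;
    demo_unicode *str;
} demo_unicode_object;

/* Direct access to the buffer; only meaningful for one code-unit width. */
#define DEMO_AS_UNICODE(op) ((op)->str)
#endif

/* "Interpreter" side: defines only the symbol that matches its own width.
 * An extension compiled for the other width would leave an unresolved
 * reference here, which is the loud, explicit failure. */
const int DEMO_WIDTH_SYMBOL = DEMO_UNICODE_SIZE;

/* "Extension" side: touching the tagged symbol creates the link-time tie. */
static int demo_linked_width(void)
{
    assert(DEMO_WIDTH_SYMBOL == (int)sizeof(demo_unicode));
    return DEMO_WIDTH_SYMBOL;
}
```

An extension that never defines the opt-in macro never sees the struct or the accessor, so it cannot quietly depend on the representation. One that does opt in carries a reference that only an interpreter of the matching width can satisfy.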