[Python-ideas] UCS2 vs UCS4 ABIs

Guido van Rossum guido at python.org
Mon Nov 2 18:57:52 CET 2009


On Mon, Nov 2, 2009 at 9:45 AM, Daniel Stutzbach
<daniel at stutzbachenterprises.com> wrote:
> On Mon, Nov 2, 2009 at 11:34 AM, Guido van Rossum <guido at python.org> wrote:
>>
>> This sounds attractive, but I kind of doubt that changing a single API
>> is sufficient. Perhaps it would be useful to do a kind of review or
>> survey of how many Unicode APIs are used by the typical extension?
>
> I made an editing error.  I meant to suggest altering all the PyUnicode_*
> macro/functions, except those that explicitly use Py_UNICODE or
> PyUnicodeObject in their signature.  PyUnicode_FromString was just an
> example.

We'd also have to hide the macros that can be used to access the
internals of a PyUnicodeObject, in order for that approach to be safe.
Basically, an extension would have to include a second header file to
use those macros and it would have to somehow indicate to the linker
that it is using UCS2 or UCS4 internals as well.
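
A rough sketch of what such a split might look like, collapsed into one
file so it compiles on its own. Every demo_* / DEMO_* name is
hypothetical and nothing here is an existing CPython API; the point is
only that code touching the internals ends up referencing a symbol
whose name encodes the code unit width.

```c
/* Hypothetical sketch: none of the demo_* names exist in CPython. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build-time choice of code unit width: 2 (UCS2) or 4 (UCS4). */
#ifndef DEMO_UNICODE_SIZE
#define DEMO_UNICODE_SIZE 2
#endif

#if DEMO_UNICODE_SIZE == 2
typedef uint16_t demo_unicode_t;
#define DEMO_ABI_TAG demo_unicode_abi_ucs2
#else
typedef uint32_t demo_unicode_t;
#define DEMO_ABI_TAG demo_unicode_abi_ucs4
#endif

/* Interpreter side: defines exactly one width-specific symbol.
 * In a split build this definition would live in the interpreter and
 * the header would carry only `extern const int DEMO_ABI_TAG;`. */
const int DEMO_ABI_TAG = DEMO_UNICODE_SIZE;

/* "Second header" side: only code that opts in sees the layout. */
typedef struct {
    size_t length;
    demo_unicode_t *str;
} demo_unicode_object;

/* Reads the width-specific symbol, so an object file using the
 * internals carries a reference that the linker has to resolve. */
static int demo_internal_width(void)
{
    return DEMO_ABI_TAG;
}

/* Accessor for the raw buffer; checks that the tag matches the
 * code unit type this translation unit was compiled with. */
static demo_unicode_t *demo_as_unicode(demo_unicode_object *op)
{
    assert(op != NULL);
    assert(demo_internal_width() == (int)sizeof(demo_unicode_t));
    return op->str;
}
```

Under these assumptions, an extension built with DEMO_UNICODE_SIZE=4
would look for demo_unicode_abi_ucs4 and fail to link against an
interpreter that only defines demo_unicode_abi_ucs2, while an extension
that never includes the internals part would carry no such reference.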

I would want to err on the safe side here -- if it were at all easy to
create an extension that *seems* to be ABI-neutral but *actually*
relies on knowledge of the UCS2 or UCS4 representation, we'd be
creating a worse problem. Users don't like stuff not working, but they
*really* don't like stuff crashing with random core dumps -- if it has
to be broken, let it break very loudly and explicitly. The current
approach satisfies that requirement -- it probably just errs too far
on the "never assume it might work" side.

-- 
--Guido van Rossum (python.org/~guido)


