[Python-Dev] New Py_UNICODE doc

"Martin v. Löwis" martin at v.loewis.de
Sat May 7 15:34:27 CEST 2005


Shane Hathaway wrote:
> I agree that UCS4 is needed.  There is a balancing act here; UTF-16 is
> widely used and takes less space, while UCS4 is easier to treat as an
> array of characters.  Maybe we can have both: unicode objects start with
> an internal representation in UTF-16, but get promoted automatically to
> UCS4 when you index or slice them.  The difference will not be visible
> to Python code.  A compile-time switch will not be necessary.  What do
> you think?

This would break backwards compatibility with existing extension modules.
Applications that call PyUnicode_AsUnicode get a Py_UNICODE* and
can use it to access the characters directly, so the width of the
internal buffer cannot change underneath them.
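
To illustrate the concern, here is a minimal sketch in plain C. It is
not CPython code: toy_unicode, toy_char and the toy_* functions are
hypothetical stand-ins invented for this example (toy_char plays the
role of a 2-byte Py_UNICODE, toy_as_unicode the role of
PyUnicode_AsUnicode), and the promotion step only approximates the
proposal quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy object; this is NOT CPython's real unicode layout. */
typedef uint16_t toy_char;      /* plays the role of a 2-byte Py_UNICODE */

typedef struct {
    size_t length;              /* number of code units in data */
    int unit_size;              /* 2 = UTF-16 units, 4 = UCS4 units */
    void *data;                 /* internal buffer, owned by the object */
} toy_unicode;

/* Create a toy string from n UTF-16 code units. Returns 0 on success. */
int toy_init_utf16(toy_unicode *u, const toy_char *src, size_t n)
{
    assert(u != NULL && src != NULL && n > 0);
    u->data = malloc(n * sizeof(toy_char));
    if (u->data == NULL)
        return -1;
    memcpy(u->data, src, n * sizeof(toy_char));
    u->length = n;
    u->unit_size = 2;
    return 0;
}

/* Plays the role of PyUnicode_AsUnicode: exposes the internal buffer. */
toy_char *toy_as_unicode(toy_unicode *u)
{
    return (toy_char *)u->data;
}

/* The proposed automatic promotion: widen every unit to 4 bytes.
   Surrogate pairs are copied unit by unit to keep the sketch short.
   Any pointer handed out earlier by toy_as_unicode dangles afterwards. */
int toy_promote_to_ucs4(toy_unicode *u)
{
    assert(u != NULL);
    if (u->unit_size == 4)
        return 0;
    const toy_char *old = u->data;
    uint32_t *wide = malloc(u->length * sizeof(uint32_t));
    if (wide == NULL)
        return -1;
    for (size_t i = 0; i < u->length; i++)
        wide[i] = old[i];
    free(u->data);
    u->data = wide;
    u->unit_size = 4;
    return 0;
}

/* What an existing extension does: treat the buffer as an array of
   2-byte units. Returns 1 if the first n units equal expected, else 0. */
int toy_extension_matches(const toy_char *p, const toy_char *expected,
                          size_t n)
{
    return memcmp(p, expected, n * sizeof(toy_char)) == 0;
}
```

Before promotion the extension-style read sees the expected text. After
promotion the same read no longer matches, because the code still
assumes 2-byte units while the buffer now holds 4-byte ones.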

Regards,
Martin

