[New-bugs-announce] [issue9979] Create PyUnicode_AsWideCharString() function

STINNER Victor report at bugs.python.org
Wed Sep 29 02:20:50 CEST 2010

New submission from STINNER Victor <victor.stinner at haypocalc.com>:

PyUnicode_AsWideChar() doesn't merge surrogate pairs on a system with a 32-bit wchar_t when Python is compiled in narrow mode (sizeof(wchar_t) == 4 and sizeof(Py_UNICODE) == 2); see issue #8670.

This problem is not easy to fix because the callers of PyUnicode_AsWideChar() assume that the output (wide character) string has the same length (in characters) as the input (PyUnicode) string; in other words, they assume that sizeof(wchar_t) == sizeof(Py_UNICODE). In addition, PyUnicode_AsWideChar() doesn't write a nul character at the end if the output string is truncated.

To prepare for this change, a new PyUnicode_AsWideCharString() function would help because it computes the size of the output buffer itself (whereas PyUnicode_AsWideChar() requires the caller to pass the output buffer as an argument).

The attached patch implements it:
/* Convert the Unicode object to a wide character string. The output string
   always ends with a nul character. If size is not NULL, write the number of
   wide characters (including the final nul character) into *size.

   Returns a buffer allocated by PyMem_Alloc() (use PyMem_Free() to free it) on
   success. On error, returns NULL and *size is undefined. */

PyAPI_FUNC(wchar_t*) PyUnicode_AsWideCharString(
    PyUnicodeObject *unicode,   /* Unicode object */
    Py_ssize_t *size            /* number of characters of the result */
    );
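
To illustrate why the function has to compute the output size itself, here is a minimal standalone sketch of the conversion, not the code from the patch. It assumes 16-bit input code units and 32-bit output characters, uses plain malloc() instead of PyMem_Alloc(), and the names utf16_to_wide32() and is_surrogate_pair() are hypothetical. Once surrogate pairs are merged, the output can be shorter than the input, so the length is counted in a first pass before the buffer is allocated:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from the patch): true if (hi, lo) is a valid
   UTF-16 surrogate pair. */
static int is_surrogate_pair(uint16_t hi, uint16_t lo)
{
    return hi >= 0xD800 && hi <= 0xDBFF && lo >= 0xDC00 && lo <= 0xDFFF;
}

/* Hypothetical stand-in for the proposed function: convert len 16-bit code
   units into a newly allocated array of 32-bit characters that always ends
   with a nul character. If size is not NULL, the number of characters
   written (including the final nul) is stored into *size.
   Returns NULL on allocation failure; free the result with free(). */
static uint32_t *utf16_to_wide32(const uint16_t *units, size_t len,
                                 size_t *size)
{
    size_t i, n = 0;
    uint32_t *out;

    assert(units != NULL || len == 0);

    /* First pass: count output characters; a surrogate pair counts once. */
    for (i = 0; i < len; i++, n++) {
        if (i + 1 < len && is_surrogate_pair(units[i], units[i + 1]))
            i++;
    }

    /* Guard against overflow in the allocation size. */
    if (n > SIZE_MAX / sizeof(uint32_t) - 1)
        return NULL;
    out = malloc((n + 1) * sizeof(uint32_t));
    if (out == NULL)
        return NULL;

    /* Second pass: copy code units, merging each surrogate pair into a
       single code point above U+FFFF. */
    n = 0;
    for (i = 0; i < len; i++) {
        if (i + 1 < len && is_surrogate_pair(units[i], units[i + 1])) {
            out[n++] = 0x10000
                       + (((uint32_t)units[i] - 0xD800) << 10)
                       + ((uint32_t)units[i + 1] - 0xDC00);
            i++;
        }
        else {
            out[n++] = units[i];
        }
    }
    out[n] = 0;              /* the output is always nul-terminated */
    if (size != NULL)
        *size = n + 1;       /* includes the final nul character */
    return out;
}
```

With the four code units 0x0041, 0xD83D, 0xDE00, 0x0042 as input, this sketch produces three characters plus the final nul, which is the kind of length mismatch that callers of PyUnicode_AsWideChar() do not expect today.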

components: Interpreter Core, Unicode
files: pyunicode_aswidecharstring.patch
keywords: patch
messages: 117566
nosy: haypo
priority: normal
severity: normal
status: open
title: Create PyUnicode_AsWideCharString() function
versions: Python 3.2
Added file: http://bugs.python.org/file19054/pyunicode_aswidecharstring.patch

Python tracker <report at bugs.python.org>
