Problems in Using C-API for Unicode handling

Stefan Behnel stefan_ml at behnel.de
Tue Jan 13 11:52:25 EST 2009


abhi wrote:
> Now I want to utf-16 so I am trying to use the first one, but it is
> giving back NULL in case of PyObject is already Unicode type which is
> expected. What puzzles me is that PyUnicode_FromObject(PyObject *obj)
> is passing irrespective of type of PyObject. The API says it is
> Shortcut for PyUnicode_FromEncodedObject(obj, NULL, "strict") but if I
> use that, it returns NULL where as PyUnicode_FromObject works.
> 
> Is there any way by which I can take in any PyObject and convert it to
> utf-16 object? Any help is appreciated.

Use PyUnicode_FromObject() to convert the (non-string) object to a unicode
object, then encode the unicode object as UTF-16 using the respective
functions in the codecs API, e.g. PyUnicode_AsUTF16String() (see the bottom
of the C-API docs page for the unicode object).
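
As a rough illustration, here is the same two-step idea at the Python level
(Python 3 syntax, where str is the unicode type). The helper name to_utf16 is
made up for this sketch. str() stands in for the conversion step and
str.encode("utf-16") for the UTF-16 codec call, so read it as an analogy to
the C-API calls, not a transcription of them.

```python
def to_utf16(obj):
    """Convert an object to text, then encode that text as UTF-16.

    Hypothetical helper for illustration only.
    """
    # Byte strings carry no encoding information, so refuse to guess.
    if isinstance(obj, (bytes, bytearray)):
        raise TypeError("byte strings must be decoded explicitly first")
    # Step 1: obtain a unicode (text) object; text is passed through as-is.
    text = obj if isinstance(obj, str) else str(obj)
    # Step 2: encode the text. The "utf-16" codec prepends a byte order
    # mark (BOM) and uses the platform's native byte order.
    return text.encode("utf-16")


data = to_utf16(42)
# Decoding with the same codec consumes the BOM and restores the text.
assert data.decode("utf-16") == "42"
```

Use "utf-16-le" or "utf-16-be" instead if the receiver expects a fixed byte
order without a BOM.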

Note, however, that you will not succeed in converting a byte string to the
corresponding unicode string using PyUnicode_FromObject(), except in the
simple case where the string is ASCII encoded. Doing this right requires
explicit decoding using a byte encoding that you must specify (again, see
the codecs API).
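
A small Python-level sketch of that difference (Python 3 syntax). Latin-1 is
only an assumption chosen for the example; in real code you have to know the
encoding your bytes were written in. On the C side, the matching explicit
call would be something like PyUnicode_Decode() with the encoding name.

```python
# A byte string in a non-ASCII encoding ("Gruesse" with u-umlaut and
# sharp s, written as Latin-1 bytes).
raw = "Gr\u00fc\u00dfe".encode("latin-1")

# Decoding as ASCII only works for pure-ASCII data, so it fails here.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Explicit decoding with the encoding the bytes were actually written in.
text = raw.decode("latin-1")

# Only now can the text be encoded as UTF-16.
utf16 = text.encode("utf-16")
```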

Stefan


