[Python-Dev] Unicode <--> UTF-8 in CPython extension modules

Colin Walters walters at verbum.org
Sat Feb 23 00:46:38 CET 2008


On Fri, Feb 22, 2008 at 4:23 PM, John Dennis <jdennis at redhat.com> wrote:

>  Python programs which use Unicode string objects for their i18n, and
>  which "link" to C libraries expecting UTF-8 through a CPython binding
>  that uses only the 's' or 's#' formats, often seem to fail with
>  encoding errors.

One thing to be aware of is that PyGTK actually sets Python's default
encoding for Unicode objects to UTF-8.

http://bugzilla.gnome.org/show_bug.cgi?id=132040

I mention this because PyGTK is a very popular Python library on
Linux.  So currently, once you "import gtk", bindings for libraries
which use UTF-8 (as you say, the vast majority) will accept Python
unicode objects unmodified.
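
To make the difference concrete, here is a rough sketch of what a binding
using the 's' format effectively does with a unicode object under the two
default encodings. It only simulates the implicit conversion in pure
Python; to_c_string is a hypothetical helper for illustration, not a
CPython or PyGTK API, and the sample string is made up.

```python
# -*- coding: utf-8 -*-
# Sketch only: simulates the implicit unicode -> char* conversion that the
# 's' format performs using the interpreter's default encoding.
# `to_c_string` is a hypothetical helper, not part of any real API.

def to_c_string(text, default_encoding):
    """Return the bytes a char*-taking wrapper would receive for `text`."""
    return text.encode(default_encoding)

name = u"caf\u00e9"  # contains one non-ASCII character

# With the stock "ascii" default encoding the conversion fails.
try:
    to_c_string(name, "ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# With a "utf-8" default encoding the same object converts cleanly.
utf8_bytes = to_c_string(name, "utf-8")
```

A binding that encodes explicitly to UTF-8 before calling into the C
library would not depend on the process-wide default at all.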

