[Python-Dev] Py_UNICODE madness

Guido van Rossum gvanrossum at gmail.com
Wed May 4 00:44:13 CEST 2005


I think that documentation is wrong; AFAIK Py_UNICODE has always been
allowed to be either 16 or 32 bits, and the source code goes to
great lengths to make sure that you get a link error if you try to
combine extensions built with different assumptions about its size.
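
To illustrate the link-error point above, here is a minimal, self-contained
sketch of the general technique: the public function name is a macro that
expands to a width-specific symbol, so code compiled for one width cannot
resolve its calls against a library built for the other. All names here
(DEMO_UNICODE_SIZE, demo_unicode, demo_from_units, demo_abi_tag) are
hypothetical stand-ins for illustration only; this is not the actual
CPython header code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the width picked at configure time (2 or 4). */
#ifndef DEMO_UNICODE_SIZE
#define DEMO_UNICODE_SIZE 4
#endif

#if DEMO_UNICODE_SIZE == 4
typedef uint32_t demo_unicode;
/* Callers write demo_from_units(); the linker sees demo_ucs4_from_units. */
#define demo_from_units demo_ucs4_from_units
#define DEMO_ABI_TAG "ucs4"
#elif DEMO_UNICODE_SIZE == 2
typedef uint16_t demo_unicode;
/* Same source-level name, different link-level symbol. */
#define demo_from_units demo_ucs2_from_units
#define DEMO_ABI_TAG "ucs2"
#else
#error "DEMO_UNICODE_SIZE must be 2 or 4"
#endif

/* The library defines only the symbol that matches its own width.
   Returns the number of bytes needed to store n code units. */
size_t demo_from_units(const demo_unicode *units, size_t n)
{
    assert(units != NULL || n == 0);
    return n * sizeof(demo_unicode);
}

/* Reports which width this translation unit was compiled for. */
const char *demo_abi_tag(void)
{
    return DEMO_ABI_TAG;
}
```

An extension compiled with DEMO_UNICODE_SIZE set to 2 would reference
demo_ucs2_from_units, which a 4-byte build of the library never defines,
so the mismatch surfaces as an unresolved symbol at link time rather than
as silent memory corruption at run time.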

On 5/3/05, Nicholas Bastin <nbastin at opnet.com> wrote:
> The documentation for Py_UNICODE states the following:
> 
> "This type represents a 16-bit unsigned storage type which is used by
> Python internally as basis for holding Unicode ordinals. On platforms
> where wchar_t is available and also has 16-bits, Py_UNICODE is a
> typedef alias for wchar_t to enhance native platform compatibility. On
> all other platforms, Py_UNICODE is a typedef alias for unsigned
> short."
> 
> However, we have found this not to be true on at least certain RedHat
> versions (maybe all, but I'm not willing to say that at this point).
> pyconfig.h on these systems reports that PY_UNICODE_TYPE is wchar_t,
> and Py_UNICODE_SIZE is 4.  Needless to say, this isn't consistent with
> the docs.  It also creates quite a few problems when attempting to
> interface Python with other libraries which produce unicode data.
> 
> Is this a bug, or is this behaviour intended?
> 
> It turns out that at some point in the past, this created problems for
> tkinter as well, so someone just changed the internal unicode
> representation in tkinter to be 4 bytes as well, rather than tracking
> down the real source of the problem.
> 
> Is PY_UNICODE_TYPE always going to be guaranteed to be 16 bits, or is
> it dependent on your platform? (in which case we can give up now on
> Python unicode compatibility with any other libraries).  At the very
> least, if we can't guarantee the internal representation, then the
> PyUnicode_FromUnicode API needs to go away, and be replaced with
> something capable of transcoding various unicode inputs into the
> internal python representation.
> 
> --
> Nick
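
For readers wondering what the transcoding step requested above might look
like when a library hands over 16-bit data and the interpreter uses a
32-bit Py_UNICODE, here is a minimal sketch in plain C. It decodes UTF-16
code units (including surrogate pairs) into 32-bit code points. The
function name utf16_to_utf32 and its error convention are assumptions made
for this example; it is not part of the Python C API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode n UTF-16 code units from src into dst,
   which has room for cap code points. Returns the number of code points
   written, or (size_t)-1 if the input is malformed or dst is too small. */
size_t utf16_to_utf32(const uint16_t *src, size_t n,
                      uint32_t *dst, size_t cap)
{
    size_t out = 0;
    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < n; i++) {
        uint32_t cp = src[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: it must be followed by a low surrogate. */
            if (i + 1 >= n)
                return (size_t)-1;
            uint32_t lo = src[i + 1];
            if (lo < 0xDC00 || lo > 0xDFFF)
                return (size_t)-1;
            /* Combine the pair into one code point above U+FFFF. */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            i++; /* the low surrogate has been consumed */
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            /* A low surrogate with no preceding high surrogate. */
            return (size_t)-1;
        }

        if (out >= cap)
            return (size_t)-1;
        dst[out++] = cp;
    }
    return out;
}
```

The output buffer never needs more entries than the input has code units,
so a caller can size dst with cap equal to n and treat (size_t)-1 as a
sign that the input was not well-formed UTF-16.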


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

