[Python-Dev] HAVE_USABLE_WCHAR_T
M.-A. Lemburg
mal at egenix.com
Fri Oct 8 21:30:11 CEST 2004
Thomas Heller wrote:
> The Include/unicodeobject.h file says (line 103):
>
> /* If the compiler provides a wchar_t type we try to support it
> through the interface functions PyUnicode_FromWideChar() and
> PyUnicode_AsWideChar(). */
>
> This isn't true - grepping the CVS sources for this symbol shows that it
> is used in these ways:
>
> - When defined together with the WANT_WCTYPE_FUNCTIONS symbol, the
> compiler's wctype.h functions are used instead of the ones supplied with
> Python. Include/unicodeobject.h, line 294.
>
> - When defined together with MS_WINDOWS, it makes available mbcs_encode
> and mbcs_decode functions (in Modules/_codecsmodule.c), plus the
> PyUnicode_DecodeMBCS and PyUnicode_AsMBCSString functions in
> Objects/unicodeobject.c.
>
> - Contrary to the comment at the top of this message, the
> PyUnicode_FromWideChar and PyUnicode_AsWideChar functions are compiled
> when HAVE_WCHAR_H is defined. The HAVE_USABLE_WCHAR_T symbol is only
> used to determine whether memcpy is used, or the unicode characters are
> copied one by one.
>
> - Finally, again when defined together with MS_WINDOWS, it sets the
> filesystem encoding to mbcs.
>
>
> So, it seems that the HAVE_USABLE_WCHAR_T symbol doesn't play any role
> for the extension programmer *at all*.
That symbol is defined by the configure script for use in the
interpreter - why did you think it was usable for extensions?
The HAVE_USABLE_WCHAR_T symbol only means that we can use wchar_t
as a synonym for Py_UNICODE, which makes some APIs
more efficient, e.g. on Windows - nothing more.
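To illustrate the efficiency point, here is a minimal sketch of the two
copy strategies Thomas describes above (memcpy versus copying one character
at a time). It is not the code from Objects/unicodeobject.c: the names
sketch_unicode, SKETCH_HAVE_USABLE_WCHAR_T and sketch_from_wchar are
hypothetical stand-ins for Py_UNICODE, HAVE_USABLE_WCHAR_T and the
conversion routine, chosen so the example compiles without Python.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for Py_UNICODE. When the (hypothetical) macro
   SKETCH_HAVE_USABLE_WCHAR_T is defined, the internal character type
   is simply wchar_t; otherwise it is a separate 16-bit type. */
#ifdef SKETCH_HAVE_USABLE_WCHAR_T
typedef wchar_t sketch_unicode;
#else
typedef unsigned short sketch_unicode;
#endif

/* Copy n wide characters from src into dst. */
static void
sketch_from_wchar(sketch_unicode *dst, const wchar_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
#ifdef SKETCH_HAVE_USABLE_WCHAR_T
    /* Same type on both sides: one block copy is enough. */
    memcpy(dst, src, n * sizeof(sketch_unicode));
#else
    /* Types may differ in size: convert element by element. */
    for (size_t i = 0; i < n; i++)
        dst[i] = (sketch_unicode)src[i];
#endif
}
```

Either branch yields the same characters in dst; the macro only selects
the faster path where the two types are known to coincide.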
> The preprocessor flag that plays
> a role for extensions seems to be HAVE_WCHAR_H, since it marks whether
> the PyUnicode_FromWideChar and PyUnicode_AsWideChar functions are
> available or not.
Right, since wchar.h is the include file that defines the
wchar_t type.
> This has caused me quite some confusion, and so I suggest the comment
> above in the Include/unicodeobject.h file should be fixed.
>
> Finally, the docs also seem to get it wrong (although I haven't followed
> that in detail). Can't reach python.org at the moment, but the Python/C API
> manual, section 7.3.2, Unicode Objects, says:
>
> Py_UNICODE
>
> This type represents a 16-bit unsigned storage type which is used by
> Python internally as basis for holding Unicode ordinals. On platforms
> where wchar_t is available and also has 16-bits, Py_UNICODE is a
> typedef alias for wchar_t to enhance native platform compatibility. On
> all other platforms, Py_UNICODE is a typedef alias for unsigned short.
>
> Isn't the size 32 bits for wide unicode builds?
Yes.
> Please, please fix this - unicode is already complicated enough even
> without this confusion!
Please add a bug report to SF for this.
Thanks,
--
Marc-Andre Lemburg
eGenix.com