Internal Format (Re: [Python-Dev] Internationalization Toolkit)

Fredrik Lundh fredrik@pythonware.com
Wed, 10 Nov 1999 13:32:16 +0100


> What I don't like is using wchar_t if available (and then addressing
> it as if it were defined as unsigned integer). IMO, it's better
> to define a Python Unicode representation which then gets converted
> to whatever wchar_t represents on the target machine.

you should read the unicode.h file a bit more carefully:

...

/* Unicode declarations. Tweak these to match your platform */

/* set this flag if the platform has "wchar.h", "wctype.h" and the
   wchar_t type is a 16-bit unsigned type */
#define HAVE_USABLE_WCHAR_H

#if defined(WIN32) || defined(HAVE_USABLE_WCHAR_H)

    (this uses wchar_t, and also iswspace and friends)

...

#else

/* Use if you have a standard ANSI compiler, without wchar_t support.
   If a short is not 16 bits on your platform, you have to fix the
   typedef below, or the module initialization code will complain. */

    (this maps iswspace to isspace, for 8-bit characters).

#endif

...

the plan was to use the second solution (the #else branch, with
"configure" figuring out what integer type to use), together with
its own unicode database table for the is/to primitives.

(iirc, the unicode.txt file discussed this, but that one
seems to be missing from the zip archive).
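
a minimal sketch of what that second solution could look like, not
the actual unicode.h code. the names uni_char, uni_space_table,
uni_isspace and uni_type_ok are made up for illustration, the
<limits.h> test stands in for whatever "configure" would detect, and
the table is only a small subset of the whitespace code points, not
the real database:

```c
#include <limits.h>
#include <stddef.h>

/* Stand-in for the "configure" result: pick an unsigned integer type
   whose range is exactly 16 bits (hypothetical name uni_char). */
#if USHRT_MAX == 0xFFFF
typedef unsigned short uni_char;
#elif UINT_MAX == 0xFFFF
typedef unsigned int uni_char;
#else
#error "no 16-bit unsigned integer type available"
#endif

/* Illustrative subset of Unicode whitespace code points; a real
   table would be generated from the Unicode character database. */
static const uni_char uni_space_table[] = {
    0x0009, 0x000A, 0x000B, 0x000C, 0x000D, 0x0020,
    0x00A0, 0x2028, 0x2029, 0x3000
};

/* Table-driven "is" primitive: does not rely on wchar_t, iswspace
   or the C locale. Returns 1 for whitespace, 0 otherwise. */
int uni_isspace(uni_char ch)
{
    size_t i;
    size_t n = sizeof(uni_space_table) / sizeof(uni_space_table[0]);

    for (i = 0; i < n; i++) {
        if (uni_space_table[i] == ch)
            return 1;
    }
    return 0;
}

/* The kind of check module initialization could run: returns 1 if
   uni_char holds 0xFFFF and wraps to 0 at 0x10000, i.e. 16 bits. */
int uni_type_ok(void)
{
    return (uni_char)0xFFFF == 0xFFFF && (uni_char)0x10000 == 0;
}
```

a "to" primitive (upper/lower case mapping) would follow the same
pattern, with a table of code point pairs instead of a flat list.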

</F>