[Python-3000] Array typecode 'w' vs. 'u' and UCS4 builds

Christian Heimes lists at cheimes.de
Fri Oct 12 20:14:41 CEST 2007


Travis E. Oliphant wrote:
> The problem is to keep the array typecodes somewhat consistent with the
> typecodes in PEP 3118 which will be in the struct module. 
> How about making 'U' be the typecode that translates to 'u' or 'w'
> depending on the platform and supporting both 'u' and 'w' on all
> platforms by appropriate translation of bytes on getting and setting?

Now I see your point. :) Your solution sounds feasible, but can it be
implemented on all platforms? I once hit a brick wall during my work on
PythonNET. I tried to make it compatible with Mono and UCS-4 builds of
Python, but it was really hard because the .NET standards don't care
about anything other than a 16-bit wchar_t, which doesn't even translate
to UTF-16. I fear that 'w' may hit a similar wall on Windows.

Should PEP 3118 and the array module have a 'U' typecode, too? It may
prove useful for platform- and build-independent software to have a
typecode that translates to the native unicode type (UCS-2 or UCS-4).
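
To make the idea concrete, here is a rough sketch of how a 'U' typecode
might be resolved. The helper name native_unicode_typecode is
hypothetical, and the sketch assumes that 'u' stands for the 2-byte type
and 'w' for the 4-byte type, as discussed in this thread. It only picks a
typecode and does not touch the array module itself.

```python
import sys


def native_unicode_typecode(maxunicode=None):
    """Hypothetical helper: pick 'u' or 'w' for the native unicode type.

    Assumption: 'u' means a 2-byte (UCS-2) item and 'w' a 4-byte (UCS-4)
    item, following the naming used in this thread.
    """
    if maxunicode is None:
        # sys.maxunicode is the largest code point the build can store.
        maxunicode = sys.maxunicode
    # 0xFFFF indicates a narrow (UCS-2) build; anything larger is
    # treated as a wide (UCS-4) build.
    return 'u' if maxunicode <= 0xFFFF else 'w'


print(native_unicode_typecode())
```

Code that asks for 'U' would then get whichever concrete typecode matches
the running build, without having to check the build type itself.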

Christian

