[Python-3000] Array typecode 'w' vs. 'u' and UCS4 builds
Travis E. Oliphant
oliphant at enthought.com
Fri Oct 12 19:52:27 CEST 2007
Christian Heimes wrote:
> Yesterday I found a design problem in the array module. Travis Oliphant
> added a new typecode 'w' to the array module. 'w' is a wide unicode type
> that is guaranteed to be at least 4 bytes long. The 'u' typecode may be
> 2 bytes long.
>
> Unfortunately his change removed 'u' as a possible typecode which makes
> it unnecessarily hard to write code that works on Windows (UCS2 only) and
> Unix (UCS4 for most Linux distributions). I've written a patch that
> keeps 'u' in every build and adds 'w' as an alias for 'u' in UCS-4
> builds only. It also introduces the new module variable typecodes
> which is a unicode string containing all valid typecodes.
>
The difficulty is keeping the array typecodes reasonably consistent with
the typecodes in PEP 3118, which will be in the struct module.
How about making 'U' be the typecode that translates to 'u' or 'w'
depending on the platform and supporting both 'u' and 'w' on all
platforms by appropriate translation of bytes on getting and setting?
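
As a rough sketch of the 'U' idea only, and not an existing API: the helper
names resolve_typecode and unicode_array below are hypothetical, and the
sketch assumes the array module exposes a typecodes string listing the codes
valid on the running build (as the quoted patch proposes). It does not attempt
the byte translation that would let both 'u' and 'w' work everywhere.

```python
import array


def resolve_typecode(code):
    """Translate the hypothetical portable code 'U' to a concrete typecode.

    Any other code is returned unchanged.
    """
    if code != "U":
        return code
    # Assumption: array.typecodes lists every typecode valid on this build.
    # Prefer the wide 'w' when it is offered, otherwise fall back to 'u'.
    return "w" if "w" in array.typecodes else "u"


def unicode_array(text=""):
    """Build a unicode array using whichever typecode this build supports."""
    return array.array(resolve_typecode("U"), text)


if __name__ == "__main__":
    arr = unicode_array("abc")
    # The typecode and item size printed here depend on the build.
    print(arr.typecode, arr.itemsize)
```

Code written against 'U' this way would not need to know whether the build
is UCS2 or UCS4; only the lookup does.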
-Travis