[Numpy-discussion] Review of issue 825

Wed Jun 25 12:49:33 EDT 2008

On Wed, Jun 25, 2008 at 5:14 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
> OK, the problem in the UNICODE_{get,set}item routines is converting between
> ucs4 and the encoding python is using, which may be ucs2.  But there is
> something strange if sparc is using ucs4 (Py_UNICODE_WIDE) and the pointer
> ip is aligned on two bytes instead of 4, that would seem to indicate a
> problem further up the call chain. Could you check that that is actually
> happening, i.e., ip is not 4 byte aligned and Py_UNICODE_WIDE is defined?

You need to keep the test case in the first comment of the issue in mind
here: the problem is extracting the unicode string for a dtype
specified as (unsigned char, unicode string). The record is allocated
as 5 bytes, and the string is not correctly aligned within these 5
bytes for access via a long pointer, which is what the current check in
UNICODE_getitem needs in order to work.
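
As a rough illustration of the layout (a sketch using Python's struct
module, not the NumPy code itself), assume the record is one unsigned
byte followed by a single 4-byte code unit with no padding. The format
strings and the is_aligned helper are made up for this example:

```python
import struct

# Hypothetical stand-in for the record in the issue: one unsigned byte
# followed by one 4-byte (UCS4) code unit, packed with no padding.
PACKED = "=BI"  # '=' selects standard sizes and no alignment padding

packed_size = struct.calcsize(PACKED)  # 1 byte + 4 bytes
char_offset = struct.calcsize("=B")    # the code unit starts right after the byte


def is_aligned(offset, width=4):
    """Return True if offset is a multiple of width (illustrative helper)."""
    return offset % width == 0


# Build one packed record holding the byte 7 and the character 'A'.
record = struct.pack(PACKED, 7, ord("A"))

# Copy the four bytes out at the unaligned offset instead of reading
# them in place through a wider pointer.
(code_unit,) = struct.unpack_from("=I", record, char_offset)

print(packed_size, char_offset, is_aligned(char_offset), chr(code_unit))
```

The point is only that the code unit sits at offset 1 of a 5-byte
record, so reading it in place through a 4-byte pointer is not safe on
strict-alignment platforms, whereas copying the bytes out first is.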

-- 
Neil Muller
drnlmuller at gmail.com