[Numpy-discussion] Extent of unicode types in numpy

Travis Oliphant oliphant at ee.byu.edu
Mon Feb 6 11:17:00 EST 2006


Francesc Altet wrote:

>Hi,
>
>I'm a bit surprised that the unicode types are the only ones that
>break the rule: they must be specified with a different number of
>bytes than they really take. For example:
>  
>
Yeah, it's a bit annoying.  There are special checks throughout the
code for this.  The problem, though, is that sizeof(Py_UNICODE) can be
4 or 2 depending on how Python was compiled.
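
As a rough illustration of the build dependence, here is a minimal
sketch that infers the code-unit width from sys.maxunicode.  This is
only a proxy for sizeof(Py_UNICODE), not a direct measurement, and the
helper names are made up for this example.  Note that Python 3.3 and
later always report 0x10FFFF, so the narrow branch only applies to
older interpreters.

```python
import sys


def unicode_code_unit_size():
    """Guess the bytes per code unit of the interpreter's unicode type."""
    # Narrow (UCS-2) builds report sys.maxunicode == 0xFFFF;
    # wide (UCS-4) builds report 0x10FFFF.
    return 2 if sys.maxunicode == 0xFFFF else 4


def unicode_nbytes(nchars):
    """Bytes needed to hold nchars code units on this interpreter."""
    return nchars * unicode_code_unit_size()


# The same 3-character string needs 6 or 12 bytes depending on the build.
print(unicode_code_unit_size(), unicode_nbytes(3))
```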

Also, Python treats unicode and string characters as having the same
length: each one counts as a single unit of length, even though a
unicode character requires more bytes internally.
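
The length point can be sketched with the standard library alone.  The
fixed-width encodings below stand in for the 2-byte and 4-byte internal
layouts; that mapping is an assumption made for illustration, not a
statement about what any particular build stores.

```python
text = "abc"    # unicode string
raw = b"abc"    # byte string

# Both report a length of 3, counted in characters.
same_length = len(text) == len(raw) == 3

# Fixed-width encodings as stand-ins for the internal storage:
ucs2_bytes = len(text.encode("utf-16-le"))  # 2 bytes per character
ucs4_bytes = len(text.encode("utf-32-le"))  # 4 bytes per character

print(same_length, len(raw), ucs2_bytes, ucs4_bytes)
```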

So, I'm not sure exactly what to do, short of introducing a new code for 
"Unicode with specific number of bytes."

I think the inconsistency should be removed, though.  I'm just not sure 
how to do it.

-Travis





