[Numpy-discussion] adding more unicode dtypes
jtaylor.debian at googlemail.com
Wed Jan 15 13:25:31 EST 2014
On 15.01.2014 18:57, Charles R Harris wrote:
> There was a discussion of this long ago and UCS-4 was chosen as the
> numpy standard. There are just too many complications that arise in
> supporting both.
My guess is that that discussion predates Python 3, back when you could
still simply treat bytes == string?
In Python 3 you need extra code to deal with arrays containing strings, as
the S type is interpreted as bytes, which is not a string type anymore.
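To make the point concrete, here is a minimal sketch of what that extra code looks like on Python 3. The sample values and variable names are made up for illustration; np.char.decode is just one way to do the conversion.

```python
import numpy as np

# An S-dtype array holds raw bytes; on Python 3 its elements are not str.
a = np.array([b"spam", b"eggs"])       # dtype is 'S4'
first = a[0]                           # a numpy bytes scalar (subclass of bytes)
is_str = isinstance(first, str)        # False on Python 3

# To get real strings you have to decode explicitly, which produces
# a U (UCS-4) array.
decoded = np.char.decode(a, "ascii")   # dtype kind is 'U'
```

Every comparison against a str literal or any string formatting needs this decode step (or a bytes literal) first.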
Someone on IRC (I think Freddie Witherden, CC'd) had a use case with huge
ASCII tables in numpy, which now either have to be stored on disk as 4-byte
unicode or be decoded from bytes all the time.
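The size difference is easy to check. This is only a rough sketch with made-up sample data, not the actual tables from that use case:

```python
import numpy as np

words = ["alpha", "beta", "gamma"]     # hypothetical ASCII-only data

as_bytes = np.array(words, dtype="S5")    # 1 byte per character
as_unicode = np.array(words, dtype="U5")  # 4 bytes per character (UCS-4)

bytes_per_item = as_bytes.itemsize        # 5
unicode_per_item = as_unicode.itemsize    # 20
ratio = as_unicode.nbytes // as_bytes.nbytes
```

For pure ASCII content the U array takes four times the space of the S array, both in memory and when the raw buffer is written to disk.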
I personally don't use strings in arrays so I can neither judge the
impact nor the use, but it seems to me like at least having an ascii
dtype for python2<->python3 compatibility would be useful.