[Numpy-discussion] proposal: smaller representation of string arrays

Chris Barker chris.barker at noaa.gov
Thu Apr 20 13:46:31 EDT 2017


On Thu, Apr 20, 2017 at 10:36 AM, Neal Becker <ndbecker2 at gmail.com> wrote:

> I'm no unicode expert, but can't we truncate unicode strings so that only
> valid characters are included?
>

Sure -- it's just a bit fiddly, and you need to make sure that everything
gets passed through the proper mechanism. NumPy is all about folks using
other code to mess with the bytes in a NumPy array, so we can't expect that
all NumPy string arrays will have been created with NumPy code.

Does Python's string have a truncated encode option? I.e., you don't want to
encode to utf-8 and then just chop it off, since the cut can land in the
middle of a multi-byte character.
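
For illustration, here is a rough sketch of what a "truncating encode" could
look like. As far as I know, str.encode() takes only an encoding and an error
handler, with no length limit, so the helper below (truncate_utf8, a made-up
name, not a Python or NumPy API) encodes first and then backs the cut point up
to the start of a code point. It only avoids splitting a code point; it does
not try to keep combining sequences together.

```python
def truncate_utf8(text: str, max_bytes: int) -> bytes:
    """Encode text as UTF-8, keeping at most max_bytes bytes.

    Hypothetical helper: never returns a partially encoded code point.
    """
    if max_bytes < 0:
        raise ValueError("max_bytes must be non-negative")

    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return encoded

    cut = max_bytes
    # UTF-8 continuation bytes look like 10xxxxxx. If the first dropped
    # byte is one of those, the cut splits a character, so back up until
    # the first dropped byte is the start of a code point.
    while cut > 0 and (encoded[cut] & 0xC0) == 0x80:
        cut -= 1
    return encoded[:cut]
```

Something like truncate_utf8("héllo", 2) would then drop the "é" entirely
rather than keeping its first byte, so the result always decodes cleanly.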

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov