[Numpy-discussion] One-byte string dtype: third time's the charm?

Nathaniel Smith njs at pobox.com
Sun Feb 22 14:57:35 EST 2015


On Sun, Feb 22, 2015 at 11:29 AM, Sturla Molden <sturla.molden at gmail.com> wrote:
> On 22/02/15 19:21, Aldcroft, Thomas wrote:
>
>> Problems like this are now showing up in the wild [3].  Workarounds are
>> also showing up, like a way to easily convert from 'S' to 'U' within
>> astropy Tables [4], but this is really not a desirable way to go.
>> Gigabyte-sized string data arrays are not uncommon, so converting to
>> UCS-4 is a real memory and performance hit.
>
> Why UCS-4? Python's internal "flexible string representation" will
> use ASCII for ASCII text.

This is a discussion about how strings are represented as bit-patterns
inside ndarrays; the internal storage representation used by 'str' is
irrelevant.
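
To make the distinction concrete, here is a minimal sketch, assuming a
reasonably recent NumPy on Python 3. The array names and sample words
are made up for illustration. It compares the per-element storage of
the 'S' and 'U' dtypes, which is fixed by the dtype and not by how
'str' stores the text internally:

```python
import numpy as np

# Hypothetical sample data; any short ASCII words would do.
words = ["spam", "eggs"]

# 'S4': four bytes per element, one byte per character.
a_bytes = np.array(words, dtype="S4")

# '<U4': four UCS-4 code points per element, four bytes per character.
# The byte order is spelled out so the raw-buffer check below does not
# depend on the platform.
a_text = np.array(words, dtype="<U4")

print(a_bytes.itemsize, a_bytes.nbytes)
print(a_text.itemsize, a_text.nbytes)

# The raw buffers show the bit patterns actually held by each array.
first_bytes = a_bytes[:1].tobytes()
first_text = a_text[:1].tobytes()
print(first_bytes)
print(first_text)
```

Even though every character here is ASCII, each 'U' element takes four
times the space of the matching 'S' element.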

-n

-- 
Nathaniel J. Smith -- http://vorpus.org


