2) There be some way to store mostly ascii-compatible strings in a single byte-per-character array -- so not to be wasting space for "typical european-language-oriented data". Note: this should ALSO be compatible with Python's character-oriented string model. i.e. a Python String with length N will fit into a dtype of size N.arr = np.array(("this", "that",), dtype=np.single_byte_string)
(name TBD)
and arr[1] would return a python string.attempting to put in a not-compatible with the encoding String would raise an EncodingError.
This is also a use-case primarily for "casual" users -- but ones concerned with the size of the data storage and know that are using european text.
3) dtypes that support storage in particular encodings:
4) a fixed length bytes dtype -- pretty much what 'S' is now under python three -- settable from a bytes or bytearray object (or other memoryview?), and returns a bytes object.You could use astype() to convert between bytes and a specified encoding with no change in binary representation. This could be used to store any binary data, including encoded text or anything else. this should map directly to the Python bytes model -- thus NOT null-terminted.This is a little different than 'S' behaviour on py3 -- it appears that with 'S', a if ALL the trailing bytes are null, then it is truncated, but if there is a null byte in the middle, then it is preserved. I suspect that this is a legacy from Py2's use of "strings" as both text and binary data. But in py3, a "bytes" type should be about bytes, and not text, and thus null-values bytes are simply another value a byte can hold.