21 Apr
2017
4:01 a.m.
I suggest a new data type 'text[encoding]', 'T'.
I like the suggestion very much (it is even in between S and U!). The UTF-8 manifesto linked above convinced me that the number following the type code should be the number of bytes, which is nicely consistent with its use in all numerical dtypes. Anyway, more specifically on Julian's question: it seems to me one has little choice but to make a new dtype (and it is OK if that makes unicode obsolete). I think exactly which encodings to support is a separate question.

-- Marten
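To illustrate the point about the trailing number, here is a small sketch using only existing NumPy dtypes; the proposed 'T' dtype is not used, since it is only a suggestion at this stage. The helper utf8_nbytes and the sample strings are made up for the example. It shows that the number is a byte count for numerical dtypes and for 'S', a code-point count for 'U', and that for UTF-8 a character count would not determine the storage size.

```python
import numpy as np

# For numerical dtypes the trailing number is the item size in bytes.
float_size = np.dtype("f8").itemsize      # 8 bytes

# For 'S' the number counts bytes; for 'U' it counts UCS-4 code points,
# each of which takes 4 bytes.
bytes_size = np.dtype("S5").itemsize      # 5 bytes
unicode_size = np.dtype("U5").itemsize    # 20 bytes


def utf8_nbytes(text):
    """Hypothetical helper: bytes needed to store `text` as UTF-8."""
    return len(text.encode("utf-8"))


# UTF-8 is variable width, so strings with the same number of characters
# can need different numbers of bytes.
samples = ["abcde", "na\u00efve", "\u65e5\u672c\u8a9e"]
for s in samples:
    print(len(s), "characters,", utf8_nbytes(s), "bytes in UTF-8")
```

With a byte count after the type code, a fixed-width UTF-8 field would have a known item size up front, the same way 'f8' does.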