It's possible to do much better than this when defining a specialized variable-width string dtype. E.g. make the itemsize 8 bytes (like an object array, assuming a 64 bit system), but then for strings that can be encoded in 7 bytes or less of utf8 store them directly in the array; else store a pointer to a raw utf8 string on the heap. (Possibly with a reference count - there are some interesting tradeoffs there. I suspect 1-byte reference counts might be the way to go; if a logical copy would make it overflow then make an actual copy instead.) Anything involving the heap is going to have some overhead, but we don't need full fledged Python objects and once we give up mmap compatibility then there's a lot of room to tune.
-n