There is a discussion over at MicroPython about the internal representation of Unicode strings. MicroPython is aimed at embedded devices, so minimizing memory use is important, possibly even more important than performance.
(I'm not speaking on their behalf, just commenting as an interested outsider.)
At the moment, their Unicode support is patchy. They are talking about either:
* Having a build-time option to restrict all strings to ASCII-only.
(I think what they mean by that is that strings will be like Python 2 strings, ASCII-plus-arbitrary-bytes, not actually ASCII.)
* Implementing Unicode internally as UTF-8, and giving up O(1) indexing operations.
Would either of these trade-offs be acceptable while still claiming "Python 3.4 compatibility"?
My own feeling is that O(1) string indexing is a quality-of-implementation issue, not a deal breaker for calling it a Python. I can't see any requirement in the docs that str[n] must take O(1) time, but perhaps I have missed something.
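
To make the indexing trade-off concrete, here is a rough sketch of what str[n] has to do when the string is stored as UTF-8. It is only an illustration: the helper name utf8_index is made up for this post, it is not MicroPython's code, and it ignores negative indexes and slices. Because code points occupy between one and four bytes, the only way to find the n-th one without extra bookkeeping is to walk the buffer from the start, which is O(n).

```python
def utf8_index(buf: bytes, n: int) -> str:
    """Return the n-th code point of a valid UTF-8 buffer by linear scan.

    Hypothetical helper for illustration only; not taken from MicroPython.
    """
    if n < 0:
        raise IndexError("negative indexes are not handled in this sketch")
    count = -1      # index of the code point we are currently inside
    start = None    # byte offset where the n-th code point begins
    for pos, byte in enumerate(buf):
        # Continuation bytes look like 10xxxxxx; any other byte starts a
        # new code point.
        if byte & 0xC0 != 0x80:
            count += 1
            if count == n:
                start = pos
            elif count == n + 1:
                # Reached the start of the next code point, so the one we
                # want ends just before this byte.
                return buf[start:pos].decode("utf-8")
    if start is None:
        raise IndexError("string index out of range")
    # The requested code point is the last one in the buffer.
    return buf[start:].decode("utf-8")
```

For pure ASCII data every code point is one byte, so an implementation could special-case that and keep O(1) indexing there. That is an assumption about what an implementation might do, not a description of what MicroPython does.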