[Python-Dev] Internal representation of strings and Micropython
Steven D'Aprano
steve at pearwood.info
Wed Jun 4 03:17:18 CEST 2014
There is a discussion over at MicroPython about the internal
representation of Unicode strings. MicroPython is aimed at embedded
devices, and so minimizing memory use is important, possibly even
more important than performance.
(I'm not speaking on their behalf, just commenting as an interested
outsider.)
At the moment, their Unicode support is patchy. They are talking about
either:
* Having a build-time option to restrict all strings to ASCII-only.
(I think what they mean by that is that strings will be like Python 2
strings, ASCII-plus-arbitrary-bytes, not actually ASCII.)
* Implementing Unicode internally as UTF-8, and giving up O(1)
indexing operations.
https://github.com/micropython/micropython/issues/657
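To make the second option concrete, here is a rough sketch of what indexing
costs when a string is stored as UTF-8 bytes. This is only an illustration,
not MicroPython's actual code: the function name utf8_index is made up, and
it assumes the buffer holds valid UTF-8. Finding the n-th code point means
scanning from the start and counting lead bytes, which is O(n) rather than
O(1).

```python
def utf8_index(buf: bytes, n: int) -> str:
    """Return the n-th code point of a UTF-8 buffer by linear scan.

    Hypothetical helper for illustration; assumes buf is valid UTF-8.
    """
    if n < 0:
        raise IndexError("negative indices are not handled in this sketch")
    count = -1      # index of the most recent code point seen so far
    start = None    # byte offset where the n-th code point begins
    for i, b in enumerate(buf):
        # Bytes of the form 0b10xxxxxx are continuation bytes;
        # any other byte starts a new code point.
        if (b & 0xC0) != 0x80:
            count += 1
            if count == n:
                start = i
            elif count == n + 1:
                # The next code point begins here, so the n-th one ends here.
                return buf[start:i].decode("utf-8")
    if start is None:
        raise IndexError("string index out of range")
    # The n-th code point is the last one in the buffer.
    return buf[start:].decode("utf-8")


text = "naïve €uro"
data = text.encode("utf-8")
# The scan agrees with ordinary str indexing for every position.
assert all(utf8_index(data, i) == text[i] for i in range(len(text)))
```

The memory saving comes from ASCII-heavy text taking one byte per character,
at the price of every str[n] walking the buffer up to position n.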
Would either of these trade-offs be acceptable while still claiming
"Python 3.4 compatibility"?
My own feeling is that O(1) string indexing operations are a quality of
implementation issue, not a deal breaker to call it a Python. I can't
see any requirement in the docs that str[n] must take O(1) time, but
perhaps I have missed something.
--
Steven