[Python-Dev] Internal representation of strings and Micropython

Donald Stufft donald at stufft.io
Wed Jun 4 03:46:22 CEST 2014


I think UTF-8 is the best option.

> On Jun 3, 2014, at 9:17 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> 
> There is a discussion over at MicroPython about the internal 
> representation of Unicode strings. MicroPython is aimed at embedded
> devices, and so minimizing memory use is important, possibly even 
> more important than performance.
> 
> (I'm not speaking on their behalf, just commenting as an interested 
> outsider.)
> 
> At the moment, their Unicode support is patchy. They are talking about 
> either:
> 
> * Having a build-time option to restrict all strings to ASCII-only.
> 
>  (I think what they mean by that is that strings will be like Python 2 
>  strings, ASCII-plus-arbitrary-bytes, not actually ASCII.)
> 
> * Implementing Unicode internally as UTF-8, and giving up O(1) 
>  indexing operations.
> 
> https://github.com/micropython/micropython/issues/657
> 
> 
> Would either of these trade-offs be acceptable while still claiming 
> "Python 3.4 compatibility"?
> 
> My own feeling is that O(1) string indexing operations are a quality of 
> implementation issue, not a deal breaker to call it a Python. I can't 
> see any requirement in the docs that str[n] must take O(1) time, but 
> perhaps I have missed something.
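
To make the indexing trade-off concrete, here is a minimal sketch of what
str[n] has to do when the string is stored as UTF-8 bytes. This is only an
illustration, not MicroPython's or CPython's actual code; the names
utf8_seq_len and utf8_char_at are hypothetical, and the sketch assumes the
buffer holds valid UTF-8 and ignores negative indexes.

```python
def utf8_seq_len(lead: int) -> int:
    """Return the byte length of a UTF-8 sequence from its lead byte."""
    if lead < 0x80:
        return 1            # 0xxxxxxx: ASCII
    if lead >> 5 == 0b110:
        return 2            # 110xxxxx
    if lead >> 4 == 0b1110:
        return 3            # 1110xxxx
    if lead >> 3 == 0b11110:
        return 4            # 11110xxx
    raise ValueError("invalid UTF-8 lead byte")


def utf8_char_at(buf: bytes, index: int) -> str:
    """Return the code point at position `index` of UTF-8 encoded `buf`.

    Hypothetical helper: it walks the buffer from the start, so the cost
    grows with `index` (O(n)) instead of being constant.
    """
    if index < 0:
        raise IndexError("negative indexes are not handled in this sketch")
    pos = 0  # byte offset of the code point currently under the cursor
    for _ in range(index):
        if pos >= len(buf):
            raise IndexError("string index out of range")
        pos += utf8_seq_len(buf[pos])  # skip one whole code point
    if pos >= len(buf):
        raise IndexError("string index out of range")
    end = pos + utf8_seq_len(buf[pos])
    return buf[pos:end].decode("utf-8")


text = "naïve ☃ 😀"
data = text.encode("utf-8")
# Each lookup rescans from byte 0; results match ordinary str indexing.
chars = [utf8_char_at(data, i) for i in range(len(text))]
```

For pure-ASCII data every sequence is one byte long, so an implementation
could special-case that and keep O(1) indexing there; only strings with
multi-byte characters need the scan.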
> 
> -- 
> Steven

