[Python-Dev] Internal representation of strings and Micropython

Jeff Allen ja.py at farowl.co.uk
Wed Jun 4 09:41:12 CEST 2014


Jython uses UTF-16 internally -- probably the only sensible choice in a 
Python that can call Java. Indexing by code point is O(N), fundamentally. 
By "fundamentally", I mean for those strings that have not yet noticed 
that they contain no supplementary (>0xffff) characters; a string that 
has noticed this can be indexed directly by code unit.
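
To illustrate the idea, here is a minimal sketch in plain Python. It is not
Jython's actual code: the class Utf16Str, its bmp_only flag and its scan()
method are hypothetical names made up for this example, and it assumes
well-formed text (no lone surrogates) and non-negative indexes.

```python
def utf16_units(text):
    """Return the UTF-16 code units of a well-formed str as a list of ints."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def is_high_surrogate(unit):
    return 0xD800 <= unit <= 0xDBFF


class Utf16Str:
    """Hypothetical string type that stores UTF-16 code units."""

    def __init__(self, text):
        self.units = utf16_units(text)
        self.bmp_only = None  # None: not yet checked; True/False once scanned

    def scan(self):
        """One O(N) pass that records whether any surrogate is present."""
        self.bmp_only = not any(0xD800 <= u <= 0xDFFF for u in self.units)
        return self.bmp_only

    def __getitem__(self, n):
        if n < 0:
            raise IndexError("negative indexes are not handled in this sketch")
        if self.bmp_only:
            # Fast path: one code unit per character, so index directly.
            return chr(self.units[n])
        # Slow path: walk from the start, stepping over surrogate pairs.
        i = 0
        for _ in range(n):
            if i >= len(self.units):
                raise IndexError("string index out of range")
            i += 2 if is_high_surrogate(self.units[i]) else 1
        if i >= len(self.units):
            raise IndexError("string index out of range")
        unit = self.units[i]
        if is_high_surrogate(unit):
            # Combine the surrogate pair into one supplementary code point.
            low = self.units[i + 1]
            return chr(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
        return chr(unit)
```

A string containing a supplementary character, such as
Utf16Str("a\U0001F600b"), always takes the O(N) walk. A string such as
Utf16Str("hello") takes the O(1) path only after scan() has been called on
it, which is one way to read "noticed" above.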

I've toyed with making this O(1) universally. Like Steven, I understand 
this to be a freedom afforded to implementers, rather than an issue of 
conformity.

Jeff Allen

On 04/06/2014 02:17, Steven D'Aprano wrote:
> There is a discussion over at MicroPython about the internal
> representation of Unicode strings.
...
> My own feeling is that O(1) string indexing operations are a quality of
> implementation issue, not a deal breaker to call it a Python. I can't
> see any requirement in the docs that str[n] must take O(1) time, but
> perhaps I have missed something.
>


