Article on the future of Python
Paul Rubin
no.email at nospam.invalid
Wed Sep 26 13:32:40 EDT 2012
Chris Angelico <rosuav at gmail.com> writes:
> So, I don't actually have any stats for you, because it's really easy
> to just not index strings at all.
Right, that's why I think the O(n) indexing issue of UTF-8 may be
overblown. Haskell 98 was mentioned earlier as a language that did
Unicode "correctly", but its strings are linked lists of code points.
They are a performance pig, to be sure, but the O(n) indexing is usually
not the bottleneck. These days there is a "Text" module that I think is
basically UTF-16 arrays. I have been meaning to find out what happens
there with non-BMP characters, which need a surrogate pair (two 16-bit
code units) in UTF-16.
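
To make the two points above concrete, here is a rough Python sketch (not
taken from any library; the helper names utf8_index and utf16_units are
made up for illustration). The first function finds the i-th code point
of a UTF-8 byte string by scanning from the start, which is the O(n)
indexing cost being discussed. The second counts 16-bit code units, to
show that a non-BMP character occupies two slots in a UTF-16 array.

```python
def utf8_index(data: bytes, i: int) -> str:
    """Return the i-th code point of UTF-8 `data` by scanning from byte 0.

    Hypothetical helper for illustration: assumes `data` is valid UTF-8.
    The scan is O(n) in the byte length, since code points vary in width.
    """
    if i < 0:
        raise IndexError("negative index not supported in this sketch")
    count = -1      # index of the code point whose lead byte was last seen
    start = None    # byte offset where the i-th code point begins
    for pos, byte in enumerate(data):
        # Continuation bytes look like 0b10xxxxxx; anything else starts
        # a new code point.
        if byte & 0xC0 != 0x80:
            count += 1
            if count == i:
                start = pos
            elif count == i + 1:
                return data[start:pos].decode("utf-8")
    if start is None:
        raise IndexError("code point index out of range")
    return data[start:].decode("utf-8")  # the i-th code point was the last


def utf16_units(s: str) -> int:
    """Number of 16-bit code units needed to store `s` as UTF-16."""
    return len(s.encode("utf-16-le")) // 2


if __name__ == "__main__":
    text = "a\u00e9\U0001F600z"          # 1-, 2-, 4- and 1-byte code points
    encoded = text.encode("utf-8")
    print(len(encoded))                   # 8 bytes for 4 code points
    print(utf8_index(encoded, 3))         # z
    print(utf16_units("\U0001F600"))      # 2 (a surrogate pair)
```

If a program mostly iterates over strings or searches them, rather than
jumping to arbitrary offsets, the scan in utf8_index never runs, which is
the reason the indexing cost may matter less than it first appears.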