RE Module Performance

Chris Angelico rosuav at gmail.com
Wed Jul 24 16:56:27 CEST 2013


On Thu, Jul 25, 2013 at 12:47 AM, Michael Torrie <torriem at gmail.com> wrote:
> On 07/24/2013 07:40 AM, wxjmfauth at gmail.com wrote:
>> Sorry, you are not understanding Unicode. What is a Unicode
>> Transformation Format (UTF), what is the goal of a UTF and
>> why it is important for an implementation to work with a UTF.
>
> Really?  Enlighten me.
>
> Personally, I would never use UTF as a representation *in memory* for a
> unicode string if it were up to me.  Why?  Because UTF characters are
> not uniform in byte width so accessing positions within the string is
> terribly slow and has to always be done by starting at the beginning of
> the string.  That's at minimum O(n) compared to FSR's O(1).  Surely you
> understand this.  Do you dispute this fact?

Take care here; UTF is a general term for Unicode Transformation Formats,
of which one (UTF-32) is fixed-width. Every other UTF-n is
variable-width, though, so your point still stands. UTF-32 is the basis
for Python's FSR.
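
To make the width difference concrete, here is a minimal Python 3 sketch.
The helper names (encoded_widths, utf8_index, utf32_index) are made up
for illustration and are not part of any library; the "-le" codecs are
used only so that no byte order mark is counted. It shows the per-character
byte width in each UTF, and why indexing into UTF-8 needs a scan from the
start while UTF-32 can jump straight to the offset.

```python
def encoded_widths(ch):
    """Return the byte length of one character in UTF-8, UTF-16 and UTF-32."""
    codecs = ("utf-8", "utf-16-le", "utf-32-le")
    return tuple(len(ch.encode(codec)) for codec in codecs)


def utf8_index(data, n):
    """Return the n-th code point of UTF-8 bytes by scanning from the start: O(n)."""
    count = -1
    start = None
    for i, byte in enumerate(data):
        # Continuation bytes look like 0b10xxxxxx; anything else starts a code point.
        if byte & 0xC0 != 0x80:
            count += 1
            if count == n:
                start = i
            elif count == n + 1:
                return data[start:i].decode("utf-8")
    if start is None:
        raise IndexError(n)
    return data[start:].decode("utf-8")  # the n-th code point was the last one


def utf32_index(data, n):
    """Return the n-th code point of UTF-32-LE bytes by direct offset: O(1)."""
    if not 0 <= n < len(data) // 4:
        raise IndexError(n)
    return data[4 * n:4 * n + 4].decode("utf-32-le")


if __name__ == "__main__":
    text = "a\u00e9\u20ac\U0001F600z"
    for ch in text:
        print("U+%04X" % ord(ch), encoded_widths(ch))
    print(utf8_index(text.encode("utf-8"), 3) == utf32_index(text.encode("utf-32-le"), 3))
```

Both lookups return the same character; only the amount of work differs.
The UTF-8 version has to look at every byte before the target, whereas
the UTF-32 version is a single slice.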

ChrisA


