[Python-3000] string C API

Josiah Carlson jcarlson at uci.edu
Fri Sep 15 23:16:57 CEST 2006


"Paul Prescod" <paul at prescod.net> wrote:
[snip]
> The result seems obvious to me...8-bit-fixed encodings are a terrible idea
> and need to just go away. Let's not build them into Python's core on the
> basis of a minor and fleeting performance improvement.

Variable-width encodings make many operations difficult, not the least
of which is "what is the code point for the ith character?"  The
benefit of going with a fixed-width encoding (like Python currently does
for unicode objects with UCS-2) is that so many computations are merely
an iteration over a sequence of chars/shorts/ints.  No need to recode
for complicated operations, no need to understand utf-8 for string
operations, etc.


 - Josiah


