[Python-Dev] UCS2/UCS4 default

Terry Reedy tjreedy at udel.edu
Thu Jul 3 23:01:48 CEST 2008

Guido van Rossum wrote:
> On Thu, Jul 3, 2008 at 10:44 AM, Terry Reedy <tjreedy at udel.edu> wrote:
>> The premise of this thread seems to be that the majority should suffer for
>> the benefit of a few.  That is not Python's philosophy.

The premise is the OP's idea that Python should switch to all UCS4 to 
create a more pure ('ideal') situation, or the idea that len(s) should 
count codepoints (correct term?) for all builds as a matter of purity, 
even though it would be time-costly on 16-bit builds as a matter of 

> Who are the many here?

Those who are happy with 3.0 strings as they are for their systems and 
who would not benefit from the proposed change.  In other words, what 
you say below.

> Who are the few?

Those who are stuck with 16-bit builds and who would benefit from 
32-bit builds because they need to use non-basic-plane chars and need 
to use the operations for which a change would make a positive difference.
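
To make the difference concrete, here is a minimal sketch of why a 
codepoint-counting len(s) costs a scan when the storage is 16-bit code 
units. It is written for a current Python 3, where len() already counts 
codepoints, so the 16-bit storage is simulated by encoding to UTF-16. 
The helper names utf16_code_units and count_code_points are made up 
for this illustration and are not part of any Python API.

```python
import struct

def utf16_code_units(s):
    """Return the UTF-16 code units of s as a list of ints."""
    data = s.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

def count_code_points(units):
    """Count codepoints in well-formed UTF-16 by scanning every unit: O(n)."""
    # A low (trailing) surrogate, 0xDC00..0xDFFF, completes a pair that was
    # already counted at its high surrogate, so it is skipped.
    return sum(1 for u in units if not 0xDC00 <= u <= 0xDFFF)

s = "a\U00010400b"                # U+10400 lies outside the BMP
units = utf16_code_units(s)
print(len(units))                 # 4 code units; a 16-bit build reports this as len(s)
print(count_code_points(units))   # 3 codepoints; needs the full scan above
```

Counting code units is a stored length, while counting codepoints has 
to look at every unit, which is the time cost referred to above.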

In my opinion, such people with Windows should at least install Linux + 
UCS4 Python as an alternate install.

> I'd venture that (at least for
> the foreseeable future, say, until China will finally have taken over
> the role of the US as the de-facto dominant super power :-) the many
> are people whose app will never see a Unicode character outside the
> BMP, or who do such minimal string processing that their code doesn't
> care whether it's handling UTF-16-encoded data.

Just what I meant.

> Python's philosophy is also Practicality Beats Purity.

Just what I meant, in the form 'Purity does not beat Practicality'.

Having summarized, perhaps too briefly, why Python's basic unicode 
implementation would not change in the near future, I went on to my main 
point, which is that better docs might be an alternative solution to the 
problems raised.

