[I18n-sig] Re: [Python-Dev] Unicode debate

Just van Rossum just@letterror.com
Wed, 3 May 2000 12:41:27 +0100


At 10:15 AM +0200 03-05-2000, M.-A. Lemburg wrote:
>Huh ? The pure fact that you can have two (or more)
>Unicode characters to represent a single character makes
>Unicode itself have the same problems as e.g. UTF-8.

It's the different level of abstraction that makes it different.

Even if "e`" is _equivalent_ to the combined character, that doesn't mean
that it _is_ the combined character at the level of abstraction we are
talking about: it's still 2 characters, and those can be sliced apart
without a problem. Slicing UTF-8 doesn't work because it can yield invalid
strings; slicing "e`" does work, since both halves are valid strings. The
fact that "e`" is semantically equivalent to the combined character doesn't
change that.
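
To make the distinction concrete, here is a minimal sketch in present-day
Python 3 (an assumption on my part; the str/bytes split and the helper name
is_valid_utf8 are illustrative and not part of the original discussion). It
takes "e`" to mean "e" followed by U+0300 COMBINING GRAVE ACCENT, and the
combined character to mean U+00E8.

```python
import unicodedata

# "e" followed by U+0300 COMBINING GRAVE ACCENT: two code points.
decomposed = "e\u0300"
# U+00E8 LATIN SMALL LETTER E WITH GRAVE: the single combined character.
precomposed = "\u00e8"

# Equivalent after canonical composition, but not the same sequence.
assert decomposed != precomposed
assert unicodedata.normalize("NFC", decomposed) == precomposed

# Slicing between the two code points leaves two valid strings.
base, accent = decomposed[:1], decomposed[1:]
assert base == "e" and accent == "\u0300"


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The combined character takes two bytes in UTF-8. Slicing between those
# bytes leaves two fragments, neither of which is valid UTF-8 on its own.
encoded = precomposed.encode("utf-8")
assert len(encoded) == 2
assert is_valid_utf8(encoded)
assert not is_valid_utf8(encoded[:1])
assert not is_valid_utf8(encoded[1:])
```

The two halves of "e`" may not mean the same thing once separated, but each
is still a well-formed string; the two halves of the UTF-8 byte sequence
are not.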

Just