[I18n-sig] Re: [Python-Dev] Unicode debate

Just van Rossum just@letterror.com
Wed, 3 May 2000 08:50:11 +0100

> I just wanted to point out that the argument "slicing doesn't
> work with UTF-8" is moot.

> And failed...

>He succeeded for me.  Blind slicing doesn't always "work right" no matter
>what encoding you use, because "work right" depends on semantics beyond the
>level of encoding.  UTF-8 is no worse than anything else in this respect.

But the discussion *was* at the level of encoding! And UTF-8 is still worse:
an arbitrary UTF-8 slice may result in two illegal strings, whereas slicing
"e`" results in two perfectly legal strings, at the encoding level. Had he used
surrogates as an example, he would've been right... (But even that is an
encoding issue.)
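
A minimal Python sketch of the distinction, under two assumptions the post
does not spell out: that "e`" stands for "e" followed by U+0300 COMBINING
GRAVE ACCENT, and that "illegal" means "rejected by a strict decoder". The
helper name decodes() is made up for this illustration.

```python
def decodes(data: bytes, encoding: str) -> bool:
    """Return True if `data` is a well-formed byte string in `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Assumption: "e`" means "e" + U+0300 COMBINING GRAVE ACCENT.
text = "e\u0300"

# Slicing between code points: both halves are legal strings, even though
# the accent has been separated from its base letter.
left, right = text[:1], text[1:]
code_point_slice_ok = (
    decodes(left.encode("utf-8"), "utf-8")
    and decodes(right.encode("utf-8"), "utf-8")
)

# Slicing the UTF-8 bytes at an arbitrary offset: U+0300 encodes as two
# bytes (0xCC 0x80), and cutting between them leaves two illegal strings.
utf8 = text.encode("utf-8")               # b'e\xcc\x80'
utf8_left, utf8_right = utf8[:2], utf8[2:]
utf8_slice_ok = decodes(utf8_left, "utf-8") or decodes(utf8_right, "utf-8")

# The surrogate case: U+10000 is a surrogate pair in UTF-16, and cutting
# between the two 16-bit units leaves a lone surrogate on each side.
utf16 = "\U00010000".encode("utf-16-le")  # 4 bytes: high unit, low unit
utf16_left, utf16_right = utf16[:2], utf16[2:]
utf16_slice_ok = (
    decodes(utf16_left, "utf-16-le") or decodes(utf16_right, "utf-16-le")
)

print(code_point_slice_ok, utf8_slice_ok, utf16_slice_ok)
```

The code-point slice loses meaning (the accent is orphaned) but stays legal;
the byte-level UTF-8 slice and the split surrogate pair are both broken at
the encoding level.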