[I18n-sig] Unicode surrogates: just say no!

Guido van Rossum guido@digicool.com
Thu, 28 Jun 2001 15:37:43 -0400


> > The rationale for supporting \U is two-fold: One, importing a module
> > should not fail in one installation, and succeed in another (of the
> > same Python version). Running the module may give different results,
> > but you should be able to generate byte code. 
> 
> Isn't it already the case that big Python integer literals can be legal
> on one platform and illegal on another? (I don't know, I just thought
> that was the case....)

Yes, this is why the argument for \U as surrogate-generator is not so
strong.
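
For reference, here is a minimal sketch of the arithmetic that "\U as
surrogate-generator" implies on a build with 16-bit code units: a
code point above U+FFFF is split into a high and a low surrogate. The
helper name to_surrogate_pair is hypothetical, used only for
illustration, and is not anything in Python itself.

```python
def to_surrogate_pair(code_point):
    """Split a non-BMP code point into a UTF-16 (high, low) surrogate pair.

    Hypothetical helper for illustration; not part of any Python API.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF need surrogates")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


if __name__ == "__main__":
    high, low = to_surrogate_pair(0x1D11E)
    print("U+1D11E -> %04X %04X" % (high, low))
```

With 16-bit storage such a character therefore occupies two code
units, which is where the character-counting question comes from.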

> > ... Furthermore, people
> > using non-BMP characters in source are probably not very interested in
> > counting the characters: They want to display them. For just
> > displaying them, you need to represent them, and you need the fonts.
> > String manipulation is less important.
> 
> What are the chances that anybody is in this situation in the near
> future? Can you even display these characters on Windows? Does Tk
> support them? And if so, on what platforms? What about the Java APIs?
> (once again, these are real, not rhetorical questions)

I don't know the answers.

> Wide Python builds may be the "default" before these characters become
> practically usable in GUIs.

:-)

--Guido van Rossum (home page: http://www.python.org/~guido/)