[I18n-sig] Unicode surrogates: just say no!

Paul Prescod paulp@ActiveState.com
Thu, 28 Jun 2001 11:11:59 -0700

"Martin v. Loewis" wrote:
> The rationale for supporting \U is two-fold: One, importing a module
> should not fail in one installation, and succeed in another (of the
> same Python version). Running the module may give different results,
> but you should be able to generate byte code. 

Isn't it already the case that big Python integer literals can be legal
on one platform and illegal on another? (I don't know for sure; I just
thought that was the case.)

> ... Furthermore, people
> using non-BMP characters in source are probably not very interested in
> counting the characters: They want to display them. For just
> displaying them, you need to represent them, and you need the fonts.
> String manipulation is less important.

What are the chances that anybody is in this situation in the near
future? Can you even display these characters on Windows? Does Tk
support them? And if so, on what platforms? What about the Java APIs?
(Once again, these are real questions, not rhetorical ones.)

Wide Python builds may be the "default" before these characters become
practically usable in GUIs.
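
To make the "counting the characters" point concrete, here is a minimal
sketch of how one non-BMP character maps to a UTF-16 surrogate pair. It
assumes a current Python 3 interpreter, where strings are indexed by code
point; the helper name utf16_code_units is made up for illustration and is
not part of any library.

```python
def utf16_code_units(ch):
    """Return the UTF-16 code units of a string as a list of ints.

    Hypothetical helper for illustration only: it encodes to big-endian
    UTF-16 and reads the result back two bytes at a time.
    """
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# U+10400 lies outside the Basic Multilingual Plane, so it needs \U, not \u.
ch = "\U00010400"

units = utf16_code_units(ch)

# One code point, but two 16-bit code units (a high and a low surrogate).
print(len(ch), [hex(u) for u in units])
```

An interpreter that stores strings as 16-bit units would presumably hold
this literal as those two surrogates, so a character count there would
differ from the code-point count, even though displaying the character
needs only the representation and a font.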