Alexander Belopolsky wrote:
Two recently reported issues brought to light the fact that the Python language definition is closely tied to character properties maintained by the Unicode Consortium. [1,2] For example, when Python switches to Unicode 6.0.0 (planned for the upcoming 3.2 release), we will gain two additional characters that Python can use in identifiers.
With Python 3.1:
exec('\u0CF1 = 1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    ೱ = 1
    ^
SyntaxError: invalid character in identifier
but with Python 3.2a4:
exec('\u0CF1 = 1')
eval('\u0CF1')
Such changes are not new, but I agree that they should probably be highlighted in the "What's new in Python x.x" document.
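As a rough illustration of how the identifier rule follows the bundled Unicode data, the sketch below prints the UCD version of the running interpreter and checks the character from the example above. The helper name describe() is made up for this illustration; the result for U+0CF1 depends on which Unicode version your Python ships with.

```python
import unicodedata


def describe(ch):
    """Return (general category, usable as identifier?) for one character.

    Hypothetical helper for illustration only.
    """
    return unicodedata.category(ch), ch.isidentifier()


# Version of the Unicode Character Database bundled with this interpreter.
print(unicodedata.unidata_version)

# U+0CF1 is the character from the exec() example; what is printed here
# depends on the UCD version of the running Python.
print(describe('\u0CF1'))

# An ASCII letter is a valid identifier in every version.
print(describe('a'))
```

Running this under two different Python releases is a quick way to see whether a given character changed status between them.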
Of course, the likelihood is low that this change will affect any user, but the change in str.isspace() reported in  is likely to cause some trouble:
That's a classical bug fix.
While we have little choice but to follow UCD in defining str.isidentifier(), I think Python can promise users more stability in what it treats as space or as a digit in its builtins.
Why should we diverge from the work done by the Unicode Consortium? After all, most of their changes are in fact bug fixes as well.
For example, I don't think that supporting
is more important than assuring users that once their program has accepted some text as a number, they can assume that the text is ASCII.
Sorry, but I don't agree.
If ASCII numerals are an important aspect of an application, the application should make sure that only those numerals are used (e.g. by using a regular expression for checking).
In a Unicode world, not accepting non-Arabic numerals would be a limitation, not a feature. Besides, Python has had this support since Python 1.6.
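The regular-expression check suggested above could look roughly like the following. This is only a sketch: the name is_ascii_number() and the accepted number format (digits with an optional fractional part) are assumptions made for the example. Note that it spells out [0-9] rather than \d, because \d in a Python 3 str pattern also matches non-ASCII decimal digits unless re.ASCII is given.

```python
import re

# Assumed format for this sketch: one or more ASCII digits, optionally
# followed by a dot and more ASCII digits.
ASCII_NUMBER = re.compile(r'[0-9]+(?:\.[0-9]+)?')


def is_ascii_number(text):
    """Return True only if text is a number written with ASCII digits.

    Hypothetical helper for illustration only.
    """
    return ASCII_NUMBER.fullmatch(text) is not None


# ASCII digits pass the check.
print(is_ascii_number('1234.56'))

# Arabic-Indic digits are rejected by the check, although float()
# itself would accept them.
print(is_ascii_number('\u0661\u0662\u0663'))
print(float('\u0661\u0662\u0663'))
```

An application that needs ASCII-only input can run such a check before calling int() or float(), while everything else keeps the Unicode-aware behaviour.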