[Python-ideas] π = math.pi

Thomas Jollans tjol at tjol.eu
Sat Jun 3 19:28:42 EDT 2017


On 04/06/17 00:04, Chris Angelico wrote:
> On Sun, Jun 4, 2017 at 5:02 AM, Thomas Jollans <tjol at tjol.eu> wrote:
>> On 03/06/17 20:41, Chris Angelico wrote:
>>> [snip]
>>> For reference, as well as the 948 Sm, there are 1690 Mn and 5777 So,
>>> but only these characters are valid from them:
>>>
>>> \u1885 Mn MONGOLIAN LETTER ALI GALI BALUDA
>>> \u1886 Mn MONGOLIAN LETTER ALI GALI THREE BALUDA
>>> ℘ Sm SCRIPT CAPITAL P
>>> ℮ So ESTIMATED SYMBOL
>>>
>>> 2118 SCRIPT CAPITAL P and 212E ESTIMATED SYMBOL are listed in
>>> PropList.txt as Other_ID_Start, so they make sense. But that doesn't
>>> explain the two characters from category Mn. It also doesn't explain
>>> why U+309B and U+309C are *not* valid, despite being declared
>>> Other_ID_Start. Maybe it's a bug? Maybe 309B and 309C somehow got
>>> switched into 1885 and 1886??
>> \u1885 and \u1886 are categorised as letters (category Lo) by my Python
>> 3.5. (Which makes sense, right?) If your system puts them in category
>> Mn, that's bound to be a bug somewhere.
> rosuav at sikorsky:~$ python3.7 -c "import unicodedata;
> print(unicodedata.unidata_version, unicodedata.category('\u1885'))"
> 9.0.0 Mn
> rosuav at sikorsky:~$ python3.6 -c "import unicodedata;
> print(unicodedata.unidata_version, unicodedata.category('\u1885'))"
> 8.0.0 Lo
> rosuav at sikorsky:~$ python3.5 -c "import unicodedata;
> print(unicodedata.unidata_version, unicodedata.category('\u1885'))"
> 8.0.0 Lo
> rosuav at sikorsky:~$ python3.4 -c "import unicodedata;
> print(unicodedata.unidata_version, unicodedata.category('\u1885'))"
> 6.3.0 Lo
>
> Is it possible that there's a discrepancy between the Unicode version
> used by the unicodedata module and the one used by the parser?

It appears to be Unicode policy to keep characters in ID_Start (etc.) even
if this no longer fits their general category. So in Unicode 9.0, when
U+1885 and U+1886 were recategorised from Lo to Mn, they were added to
Other_ID_Start for backwards compatibility (like ℘).
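
To check this on a particular interpreter, something like the following
rough sketch should do. The helper name `describe` is made up for
illustration, and the categories it reports depend on the Unicode
version your Python was built with, so no output is shown here.

```python
import unicodedata

def describe(ch):
    """Return (code point, general category, usable as an identifier?)."""
    return ("U+%04X" % ord(ch), unicodedata.category(ch), ch.isidentifier())

# The category reported below comes from this version of the Unicode database.
print("Unicode data version:", unicodedata.unidata_version)

# The four characters discussed above: the two Mongolian letters,
# SCRIPT CAPITAL P and ESTIMATED SYMBOL.
for ch in "\u1885\u1886\u2118\u212e":
    print(*describe(ch))
```

If the third column is True while the category is Mn, Sm or So, the
character is being accepted through Other_ID_Start rather than through
its general category.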


Thomas




