[Python-Dev] Odd lines in unicodedata_db.h
Stephen J. Turnbull
stephen at xemacs.org
Sun Apr 4 12:59:14 CEST 2010
Amaury Forgeot d'Arc writes:
> I don't think so. Unicode 3.2 did contain two entries with large
> numeric values. The file Unihan-3.2.0.txt contains these two
> lines:
>
> U+4EAC kPrimaryNumeric 10,000,000,000,000,000 ten quadrillion (American)
> U+5793 kPrimaryNumeric 100,000,000,000,000,000,000 hundred quintillion (American)
They are related to the Chinese numbering system. I recall U+4EAC
having that value from my textbooks (it's the "kyo" in Tokyo, and the
"jing" in "Beijing", so quite memorable), and U+5793 looks familiar
(it's not otherwise used in Japanese AFAIK, so I'm not sure, but it
seems quite plausible that there would be a character for 10000^5).
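For anyone who wants to check the arithmetic, here is a rough sketch (not anything from the unicodedata build scripts). It parses the two values exactly as quoted above and confirms they are 10000^4 and 10000^5. The helper names parse_grouped, myriad_power and lookup are made up for illustration. The lookups at the end only print whatever the running Python's tables report, since that depends on the Unicode version bundled with the interpreter; I am assuming nothing about what they return.

```python
import unicodedata

# Values as quoted from Unihan-3.2.0.txt in the message above.
QUOTED = {
    "\u4eac": "10,000,000,000,000,000",        # U+4EAC
    "\u5793": "100,000,000,000,000,000,000",   # U+5793
}

def parse_grouped(text):
    """Turn a comma-grouped decimal string into an int."""
    return int(text.replace(",", ""))

def myriad_power(value):
    """Return n if value == 10000**n, else None."""
    n = 0
    while value > 1 and value % 10000 == 0:
        value //= 10000
        n += 1
    return n if value == 1 else None

def lookup(ch, db=unicodedata):
    """Numeric value the given database reports for ch, or None if it has none."""
    return db.numeric(ch, None)

for ch, text in QUOTED.items():
    value = parse_grouped(text)
    print("U+%04X" % ord(ch), value, "= 10000 **", myriad_power(value))
    # What these two lines print depends on the interpreter's Unicode data.
    print("  current database:", lookup(ch))
    print("  Unicode 3.2 database:", lookup(ch, unicodedata.ucd_3_2_0))
```

unicodedata.ucd_3_2_0 is the frozen Unicode 3.2 view that the module ships, which is why it is the natural place to look for the old values.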
> For some reason newer versions of the Unicode standard removed
> these values.
The characters are still there. The numeric values were probably
removed because in practice they're not actually used (at least,
almost never in Japanese). It seems a little sad to save 150 bytes or
so in a table and lose the historical meanings.