[Python-Dev] Odd lines in unicodedata_db.h

Amaury Forgeot d'Arc amauryfa at gmail.com
Sun Apr 4 11:21:49 CEST 2010


2010/4/4 MRAB <python at mrabarnett.plus.com>:
> I've just downloaded the daily snapshot at
> http://svn.python.org/snapshots/python.tar.bz2
>
> In the header file /python/Modules/unicodedata_db.h, there are the
> following lines in the change_records_3_2_0 struct:
>
>        { 255, 255, 255, 255, 1.0 },
>        { 255, 255, 255, 255, 2.0 },
>        { 255, 255, 255, 255, 3.0 },
>        { 255, 255, 255, 255, 4.0 },
>        ...
>        { 255, 255, 255, 255, 1e+16 },
>        { 255, 255, 255, 255, 1e+20 },
>
> Looks like a bug to me.

I don't think so. Unicode 3.2 did contain two entries with large numeric values.
The file Unihan-3.2.0.txt contains these two lines:

U+4EAC	kPrimaryNumeric	10,000,000,000,000,000 ten quadrillion (American)
U+5793	kPrimaryNumeric	100,000,000,000,000,000,000 hundred quintillion (American)

For some reason, newer versions of the Unicode standard removed these values.
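
As a rough check that these two Unihan entries account for the 1e+16 and 1e+20 values in the header, here is a minimal sketch that parses the two lines quoted above and converts them to floats. The helper name parse_primary_numeric is hypothetical and this is not the logic of the actual generator script; it only assumes the tab-separated layout shown above.

```python
def parse_primary_numeric(line):
    """Parse one Unihan kPrimaryNumeric line into (code_point, float_value).

    Hypothetical helper for illustration; assumes the layout
    'U+XXXX<TAB>kPrimaryNumeric<TAB>digits-with-commas description'.
    """
    code, field, rest = line.split("\t", 2)
    if field != "kPrimaryNumeric":
        raise ValueError("unexpected field: %r" % field)
    # The first whitespace-separated token holds the comma-grouped digits.
    digits = rest.split()[0].replace(",", "")
    return int(code[2:], 16), float(int(digits))


# The two lines quoted from Unihan-3.2.0.txt in this message.
LINES = [
    "U+4EAC\tkPrimaryNumeric\t10,000,000,000,000,000 ten quadrillion (American)",
    "U+5793\tkPrimaryNumeric\t100,000,000,000,000,000,000 hundred quintillion (American)",
]

values = dict(parse_primary_numeric(line) for line in LINES)

for code_point, value in values.items():
    # repr() of these floats gives the same spelling seen in the header.
    print("U+%04X -> %r" % (code_point, value))
```

This prints 1e+16 for U+4EAC and 1e+20 for U+5793, the same spellings as the last two entries in the struct.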

-- 
Amaury Forgeot d'Arc

