[issue10459] missing character names in unicodedata (CJK...)
Marc-Andre Lemburg
report at bugs.python.org
Fri Nov 19 16:29:50 CET 2010
Marc-Andre Lemburg <mal at egenix.com> added the comment:
Vlastimil Brom wrote:
>
> New submission from Vlastimil Brom <vlastimil.brom at gmail.com>:
>
> I just noticed an omission of some character names in the unicodedata module.
> These are some CJK ideographs:
>
> 龼 (0x9fbc) - 鿋 (0x9fcb)
> (CJK Unified Ideographs [19968-40959] [0x4e00-0x9fff])
>
> 𪜀 (0x2a700) - 𫜴 (0x2b734)
> (CJK Unified Ideographs Extension C [173824-177983] [0x2a700-0x2b73f])
>
> 𫝀 (0x2b740) - 𫠝 (0x2b81d)
> (CJK Unified Ideographs Extension D [177984-178207] [0x2b740-0x2b81f])
>
> The names should probably be generated, e.g. CJK UNIFIED IDEOGRAPH-2A700, etc.
I don't think we should fill those rather big ranges with generated
names, unless there's a standard for this. There are quite a
few ranges in the Unicode database that are assigned, but don't
have a literal name associated with them.
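
For illustration only, here is a minimal sketch of the kind of generated
name the report suggests. The ranges are the ones listed in the report;
the helper names generated_cjk_name and name_or_generated are
hypothetical and not part of unicodedata. It relies only on
unicodedata.name() accepting a default value that is returned when the
database has no name for the character.

```python
import unicodedata

# Code point ranges taken from the report (inclusive bounds).
REPORTED_RANGES = [
    (0x9FBC, 0x9FCB),    # end of CJK Unified Ideographs
    (0x2A700, 0x2B734),  # CJK Unified Ideographs Extension C
    (0x2B740, 0x2B81D),  # CJK Unified Ideographs Extension D
]


def generated_cjk_name(code_point):
    """Return a generated name for a code point in the reported ranges.

    Hypothetical helper: it follows the pattern proposed in the report,
    "CJK UNIFIED IDEOGRAPH-" plus the uppercase hex code point.
    Returns None for code points outside the reported ranges.
    """
    for first, last in REPORTED_RANGES:
        if first <= code_point <= last:
            return "CJK UNIFIED IDEOGRAPH-%X" % code_point
    return None


def name_or_generated(char):
    """Look up the database name first, then fall back to a generated one."""
    name = unicodedata.name(char, None)  # None if the database has no name
    if name is not None:
        return name
    return generated_cjk_name(ord(char))


if __name__ == "__main__":
    for cp in (0x9FBC, 0x2A700, 0x2B740):
        print(hex(cp), name_or_generated(chr(cp)))
```

Whether the fallback is ever reached depends on the Unicode database
shipped with the interpreter in use, so the sketch gives the same result
either way for the reported code points.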
----------
nosy: +lemburg
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10459>
_______________________________________