[issue10459] missing character names in unicodedata (CJK...)

Marc-Andre Lemburg report at bugs.python.org
Fri Nov 19 16:29:50 CET 2010


Marc-Andre Lemburg <mal at egenix.com> added the comment:

Vlastimil Brom wrote:
> 
> New submission from Vlastimil Brom <vlastimil.brom at gmail.com>:
> 
> I just noticed an omission of some character names in the unicodedata module.
> These are some CJK - Ideographs:
> 
> 龼 (0x9fbc) - 鿋 (0x9fcb)
> (CJK Unified Ideographs [19968-40959] [0x4e00-0x9fff])
> 
> 𪜀 (0x2a700) - 𫜴 (0x2b734)
> (CJK Unified Ideographs Extension C [173824-177983] [0x2a700-0x2b73f])
> 
> 𫝀 (0x2b740) - 𫠝 (0x2b81d)
> (CJK Unified Ideographs Extension D [177984-178207] [0x2b740-0x2b81f])
> 
> The names should probably be generated, e.g. CJK UNIFIED IDEOGRAPH-2A700, etc.

I don't think we should fill those rather big ranges with generated
names, unless there's a standard for this. There are quite a
few ranges in the Unicode database that are assigned, but don't
have a literal name associated with them.
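
For reference, a minimal sketch (Python 3) of how one could check whether
a given code point has a name, next to the generated form suggested in the
report. The helpers generated_cjk_name() and lookup() are hypothetical names
made up for this illustration; only unicodedata.name() is the real API, and
what it returns for these code points depends on the Unicode database
shipped with the interpreter.

```python
import unicodedata


def generated_cjk_name(code_point):
    """Hypothetical helper: build a name using the pattern from the report,
    i.e. the fixed prefix followed by the code point in uppercase hex."""
    return "CJK UNIFIED IDEOGRAPH-%04X" % code_point


def lookup(code_point):
    """Return the name unicodedata knows for the code point, or None.

    Without the second argument, unicodedata.name() raises ValueError
    for characters that have no name in the database.
    """
    return unicodedata.name(chr(code_point), None)


# First code point of each range mentioned in the report.
for cp in (0x9FBC, 0x2A700, 0x2B740):
    print(hex(cp), lookup(cp), generated_cjk_name(cp))
```

A None in the second column means the module has no name for that
code point; otherwise the two columns can be compared directly.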

----------
nosy: +lemburg

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10459>
_______________________________________

