[New-bugs-announce] [issue36486] Bugs and inconsistencies in unicodedata
David Corbett
report at bugs.python.org
Sat Mar 30 10:41:13 EDT 2019
New submission from David Corbett <corbett.dav at husky.neu.edu>:
In `unicodedata`, the functions `lookup` and `name` have some bugs and inconsistencies.
`lookup` matches case-insensitively, except for the algorithmic names of Hangul syllables and CJK unified ideographs, which must be in all caps. The documentation does not explain how character names are fuzzily matched.
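The case handling described above can be probed with a short sketch that uses only the standard `unicodedata` module. `try_lookup` is a hypothetical helper introduced here for illustration, and the results for the lowercase algorithmic names may differ between Python versions, so no output is shown.

```python
import unicodedata

def try_lookup(name):
    """Return the character for `name`, or None if lookup raises KeyError."""
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return None

# Each name is tried in all caps and in lowercase.
for name in (
    "LATIN SMALL LETTER A",
    "latin small letter a",
    "HANGUL SYLLABLE GA",
    "hangul syllable ga",
    "CJK UNIFIED IDEOGRAPH-4E00",
    "cjk unified ideograph-4e00",
):
    # !a prints an ASCII-escaped repr, so the output is safe on any terminal.
    print(f"{name!r}: {try_lookup(name)!a}")
```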
`lookup` accepts names like “CJK UNIFIED IDEOGRAPH-04E00”, where the code point has a superfluous leading zero.
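A minimal way to compare the canonical name with the zero-padded one, assuming only the standard library; `resolves` is a hypothetical helper, and whether the padded form is accepted depends on the Python version in use.

```python
import unicodedata

def resolves(name):
    """Return True if unicodedata.lookup accepts `name`, False on KeyError."""
    try:
        unicodedata.lookup(name)
    except KeyError:
        return False
    return True

canonical = "CJK UNIFIED IDEOGRAPH-4E00"   # the name returned by unicodedata.name
padded = "CJK UNIFIED IDEOGRAPH-04E00"     # same code point with a leading zero

print(canonical, resolves(canonical))
print(padded, resolves(padded))
```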
`lookup` and `name` don’t implement rule NR2, defined in chapter 4 of the Unicode Standard, for Tangut ideographs’ names.
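For reference, NR2 derives a name by joining a prefix, a hyphen, and the code point in uppercase hexadecimal with at least four digits. The sketch below builds the expected name for U+17000, the first Tangut ideograph, and prints it next to what `name` reports. `nr2_name` is a hypothetical helper, not part of `unicodedata`, and the value of `actual` depends on the Python version.

```python
import unicodedata

def nr2_name(prefix, code_point):
    """Build an NR2-style name: prefix, hyphen, code point in 4+ hex digits."""
    return f"{prefix}-{code_point:04X}"

cp = 0x17000  # first code point of the Tangut block
expected = nr2_name("TANGUT IDEOGRAPH", cp)

# Passing a default avoids ValueError when the character has no name.
actual = unicodedata.name(chr(cp), None)

print("expected:", expected)
print("actual:  ", actual)
```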
----------
assignee: docs at python
components: Documentation, Unicode
messages: 339203
nosy: docs at python, dscorbett, ezio.melotti, vstinner
priority: normal
severity: normal
status: open
title: Bugs and inconsistencies in unicodedata
type: behavior
versions: Python 3.7
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue36486>
_______________________________________