[issue36417] unicode.isdecimal bug in online Python 2 documentation
New submission from PEW's Corner <pewscorner@gmail.com>:

The online Python 2 documentation for unicode.isdecimal (https://docs.python.org/2/library/stdtypes.html#unicode.isdecimal) incorrectly states: "Decimal characters include digit characters". This is wrong (decimal characters are actually a subset of digit characters), and u'\u00b3' is an example of a character that is a digit but not a decimal.

Issue 26483 (https://bugs.python.org/issue26483) corrected the same bug in the Python 3 documentation, and a similar correction should be applied to the Python 2 documentation.

----------
assignee: docs@python
components: Documentation
messages: 338736
nosy: docs@python, pewscorner
priority: normal
severity: normal
status: open
title: unicode.isdecimal bug in online Python 2 documentation
type: behavior
versions: Python 2.7

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue36417>
_______________________________________
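The distinction reported above can be checked directly in an interpreter. The following is a minimal sketch, assuming Python 3, where `str` plays the role of Python 2's `unicode` type; the variable names are illustrative only and do not come from the report.

```python
# Assumption: Python 3, where str.isdigit/str.isdecimal correspond to
# the unicode.isdigit/unicode.isdecimal methods discussed in the report.
superscript_three = "\u00b3"  # SUPERSCRIPT THREE, the example from the report
ascii_five = "5"              # an ordinary decimal character

# SUPERSCRIPT THREE is a digit but not a decimal.
print(superscript_three.isdigit())    # True
print(superscript_three.isdecimal())  # False

# An ASCII digit is both a digit and a decimal.
print(ascii_five.isdigit(), ascii_five.isdecimal())  # True True
```

This matches the claim that decimal characters are a subset of digit characters rather than the other way around.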
Change by Serhiy Storchaka <storchaka+cpython@gmail.com>:

----------
keywords: +easy
stage: -> needs patch
Change by zheng <zheng0.tao@gmail.com>:

----------
keywords: +patch
pull_requests: +12684
stage: needs patch -> patch review
zheng <zheng0.tao@gmail.com> added the comment:

I propose we copy over the exact changes made to the Python 3 documentation.

I looked through the code mentioned in the other thread, namely `Objects/unicodeobject.c` and `Tools/unicode/makeunicodedata.py`. The implementation is identical between Python 2 and Python 3. The only difference appears to be the Unicode version used.

    # decimal digit, integer digit
    decimal = 0
    if record[6]:
        flags |= DECIMAL_MASK
        decimal = int(record[6])
    digit = 0
    if record[7]:
        flags |= DIGIT_MASK
        digit = int(record[7])
    if record[8]:
        flags |= NUMERIC_MASK
        numeric.setdefault(record[8], []).append(char)

Another form of validation I did was to enumerate all the digits and decimals and compare them between versions. It looks like the general change is that there are a bunch of new Unicode characters introduced in Python 3. The exception is NEW TAI LUE THAM DIGIT ONE, which gets recategorized as a digit.

Python 2, compiled with UCS4:

    import unicodedata

    for u in map(unichr, list(range(0x10FFFF))):
        if unicode.isdigit(u):
            print(unicodedata.name(u))

Python 3:

    from unicodedata import name

    for u in map(chr, range(0x10FFFF)):
        if str.isdigit(u):
            print(name(u))

----------
nosy: +zheng
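The enumeration described above can also be used to check the subset relationship within a single interpreter. This is a rough sketch, assuming Python 3; the helper `chars_matching` is a hypothetical name introduced here for illustration and is not part of the patch or the tracker discussion.

```python
import sys


def chars_matching(predicate):
    """Return the set of all characters for which predicate(char) is true.

    Hypothetical helper for illustration; it walks every code point the
    running interpreter supports.
    """
    return {chr(cp) for cp in range(sys.maxunicode + 1) if predicate(chr(cp))}


decimals = chars_matching(str.isdecimal)
digits = chars_matching(str.isdigit)

# Every decimal character is also a digit character ...
print(decimals <= digits)

# ... but some digits are not decimals, e.g. SUPERSCRIPT THREE.
digit_only = digits - decimals
print("\u00b3" in digit_only)
print(len(decimals), len(digits))
```

The exact counts depend on the Unicode version bundled with the interpreter, which is the difference between versions noted in the comment above.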
Zachary Ware <zachary.ware@gmail.com> added the comment:

As Python 2.7 has reached EOL and the branch is closed to regular maintenance, I'm closing the issue. Thanks for the report and patch anyway!

----------
nosy: +zach.ware
resolution: -> out of date
stage: patch review -> resolved
status: open -> closed
participants (4)
- PEW's Corner
- Serhiy Storchaka
- Zachary Ware
- zheng