[docs] [issue25275] Documentation v/s behaviour mismatch wrt integer literals containing non-ASCII characters
Shreevatsa R
report at bugs.python.org
Wed Sep 30 22:45:10 CEST 2015
Shreevatsa R added the comment:
Minor difference, but the relevant function for int() is not quite isdigit(), e.g.:
>>> import unicodedata
>>> s = u'\u2460'
>>> unicodedata.name(s)
'CIRCLED DIGIT ONE'
>>> print s
①
>>> s.isdigit()
True
>>> s.isdecimal()
False
>>> int(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'decimal' codec can't encode character u'\u2460' in position 0: invalid decimal Unicode string
It seems to be isdecimal(), with one addition: as long as the string contains digits, many leading and trailing space-like characters are also allowed (e.g. code point 5760 OGHAM SPACE MARK, 8195 EM SPACE, or 12288 IDEOGRAPHIC SPACE):
>>> 987 == int(u'\u3000\n 987\u1680\t')
True
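For anyone who wants to reproduce this without retyping the session, here is a minimal sketch of the same checks. It assumes Python 3, where int() raises a plain ValueError rather than the UnicodeEncodeError shown above, and the helper name int_accepts is hypothetical, made up for this illustration:

```python
def int_accepts(s):
    """Return True if int(s) succeeds, False if it raises ValueError.

    Hypothetical helper for illustration; assumes Python 3 behaviour.
    """
    try:
        int(s)
    except ValueError:
        return False
    return True


circled_one = '\u2460'             # CIRCLED DIGIT ONE: a "digit" but not a "decimal"
arabic_123 = '\u0661\u0662\u0663'  # ARABIC-INDIC DIGITS ONE, TWO, THREE: decimals
padded = '\u3000\n 987\u1680\t'    # digits wrapped in Unicode whitespace

for label, text in [('circled one', circled_one),
                    ('arabic-indic 123', arabic_123),
                    ('padded 987', padded)]:
    # Compare what the str predicates say with what int() actually accepts.
    print(label, text.isdigit(), text.isdecimal(), int_accepts(text))
```

The first sample is the interesting one: isdigit() is true but int() rejects it, which is what suggests isdecimal() is the closer match. The padded sample fails both predicates as a whole string, yet int() accepts it because the surrounding whitespace is ignored.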
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue25275>
_______________________________________