Which is why I have to wonder whether there's *any* value in exposing the numeric-value property beyond regular old digits.
Running the following (in IDLE or PythonWin, with a font that covers most of Unicode, such as Tahoma):

    import unicodedata

    for c in range(0x10000):
        x = unichr(c)
        try:
            # numeric() raises ValueError for characters with no numeric value
            b = unicodedata.numeric(x)
            #print "numeric:", repr(x)
            try:
                # digit() raises ValueError for characters with no digit value
                a = unicodedata.digit(x)
                if a != b:
                    print "bad", repr(x)
            except ValueError:
                print "Numeric but not digit", hex(c), x.encode("utf8"), "numeric ->", b
        except ValueError:
            pass

Finds about 130 characters. The only ones I feel are worth worrying about are the quarters, half and eighths (0xbc, 0xbd, 0xbe, 0x215b, 0x215c, 0x215d, 0x215e), which are commonly used for expressing the prices of stocks and commodities in the US. They may be rarely used, but it is better to have the values available than to have people coding up their own translation tables.
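
As a rough illustration of the stock-price case, here is a minimal sketch of what the numeric property saves people from writing by hand. It uses the Python 3 spelling (plain str rather than unichr), and parse_price is a hypothetical helper invented for this example, not anything in the library. It assumes the input is a run of decimal digits optionally followed by a vulgar fraction.

```python
import unicodedata

def parse_price(text):
    """Hypothetical helper: turn a string like '101\u00bd' into a float."""
    whole = 0
    fraction = 0.0
    for ch in text:
        # decimal() with a default returns None for anything that is
        # not a decimal digit, instead of raising ValueError.
        d = unicodedata.decimal(ch, None)
        if d is not None:
            whole = whole * 10 + d
        else:
            # numeric() covers the fractions; it raises ValueError
            # for characters with no numeric value at all.
            fraction += unicodedata.numeric(ch)
    return whole + fraction

print(parse_price("101\u00bd"))  # one hundred and one and a half
print(parse_price("7\u215e"))    # seven and seven eighths
```

Without numeric(), the else branch would need a hand-written table mapping each of the seven fraction characters to its value.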
The 0x302* 'Hangzhou' numerals look like they should be classified as digits.
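
To look at just that block, a small sketch along the same lines (again in Python 3 spelling; hangzhou_report is a name made up for this example) lists the numeric and digit values for 0x3021 through 0x3029. The results depend on the version of the Unicode database bundled with the interpreter.

```python
import unicodedata

def hangzhou_report():
    """Return (code point, numeric value, digit value or None) per numeral."""
    rows = []
    for code in range(0x3021, 0x302A):
        ch = chr(code)
        numeric = unicodedata.numeric(ch)
        # Passing a default avoids the ValueError when no digit value exists.
        digit = unicodedata.digit(ch, None)
        rows.append((code, numeric, digit))
    return rows

for code, numeric, digit in hangzhou_report():
    print(hex(code), "numeric ->", numeric, "digit ->", digit)
```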