Fredrik Lundh wrote:
But really, I don't suspect that anyone is going to do serious character to number conversion on these esoteric characters. Plain old digits will do just as they always have ...
Which is why I have to wonder whether there's *any* value in exposing the numeric-value property beyond regular old digits.
the unicode database has three fields dealing with the numeric value: decimal digit value (integer), digit value (integer), and numeric value (integer *or* rational):
"This is a numeric field. If the character has the numeric property, as specified in Chapter 4 of the Unicode Standard, the value of that character is represented with an integer or rational number in this field."
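For reference, here is a small sketch of how those three fields look through the unicodedata module in a current Python 3. The helper name numeric_fields is made up for illustration; decimal(), digit() and numeric() are the real module functions, each taking an optional default that is returned when the field is empty.

```python
import unicodedata

def numeric_fields(ch):
    """Return (decimal, digit, numeric) for ch, with None where a field is unset.

    numeric_fields is a hypothetical helper, not part of unicodedata.
    """
    return (
        unicodedata.decimal(ch, None),  # decimal digit value (integer)
        unicodedata.digit(ch, None),    # digit value (integer)
        unicodedata.numeric(ch, None),  # numeric value (returned as a float)
    )

print(numeric_fields("7"))       # ASCII digit: all three fields are set
print(numeric_fields("\u00b2"))  # SUPERSCRIPT TWO: digit and numeric only
print(numeric_fields("\u00bd"))  # VULGAR FRACTION ONE HALF: numeric only
```

The last case is the one under discussion: the database stores 1/2 as a rational, but the module hands back a float.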
here's today's proposal: let's claim that it's a bug to return a float from "numeric", and change it to return a string instead.
Hmm, how about making the return format an option?
unicodedata.numeric(char, format=('float' (default), 'string', 'fraction'))
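A rough sketch of what such an option could look like, written as a wrapper rather than a change to the module. Everything here is an assumption: the name numeric_as is hypothetical, 'fraction' is taken to mean a fractions.Fraction object, and the rational is recovered from the float with limit_denominator(), since unicodedata itself only exposes the float. A real implementation would presumably read the rational straight from the database instead.

```python
import unicodedata
from fractions import Fraction

def numeric_as(ch, format="float"):
    """Hypothetical wrapper sketching the proposed 'format' option."""
    value = unicodedata.numeric(ch)  # raises ValueError if ch has no numeric value
    if format == "float":
        return value
    # Assumption: the denominators in the database are small, so capping
    # at 1000 recovers the rational that the float was derived from.
    frac = Fraction(value).limit_denominator(1000)
    if format == "fraction":
        return frac
    if format == "string":
        return str(frac)  # e.g. '1/2', or '7' for a whole number
    raise ValueError("unknown format: %r" % (format,))

print(numeric_as("\u00bd"))              # float
print(numeric_as("\u00bd", "string"))    # string form of the rational
print(numeric_as("\u2153", "fraction"))  # VULGAR FRACTION ONE THIRD
```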
(this will match "decomposition", which is also "broken" -- it really should return a tag followed by a sequence of unicode characters).
unicodedata.decomposition(char, format=('string' (default), 'tuple'))
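Again only a sketch, assuming 'tuple' means a (tag, characters) pair. unicodedata.decomposition() is the real call and returns a string of hex code points, optionally preceded by a tag such as <fraction>; the helper name decomposition_as_tuple is hypothetical.

```python
import unicodedata

def decomposition_as_tuple(ch):
    """Hypothetical helper: split the decomposition string into (tag, chars).

    tag is None for a canonical decomposition (no <...> prefix); chars is an
    empty tuple when the character has no decomposition at all.
    """
    fields = unicodedata.decomposition(ch).split()
    tag = None
    if fields and fields[0].startswith("<"):
        tag = fields.pop(0)
    # The remaining fields are hexadecimal code points.
    return tag, tuple(chr(int(code, 16)) for code in fields)

print(decomposition_as_tuple("\u00e9"))  # LATIN SMALL LETTER E WITH ACUTE
print(decomposition_as_tuple("\u00bd"))  # VULGAR FRACTION ONE HALF
```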
I'd opt for making the API more customizable rather than trying to find the one and only true return format ;-)