[Python-ideas] [Python-Dev] Unicode minus sign in numeric conversions

Andrew Barnert abarnert at yahoo.com
Tue Jun 11 18:36:35 CEST 2013


On Jun 11, 2013, at 9:11, MRAB <python at mrabarnett.plus.com> wrote:

> On 09/06/2013 17:13, Stephen J. Turnbull wrote:
>> Steven D'Aprano writes:
>> 
>>  > > But Python goes much farther.  float('٢.๑') also returns 2.1 (not to
>>  > > mention that int('٢๑') returns 21).
>>  >
>>  > Yes. And why is this a problem? There is no ambiguity. It might
>>  > look untidy to be mixing Arab and Thai numerals in the same number,
>>  > but it is still well-defined.
>> 
>> To whom?  Unicode didacts, maybe, but I doubt there are any real users
>> who would consider that well-defined.  So the same arguments you made
>> for not permitting non-ASCII numerals in Python source code apply
>> here, although they are somewhat weaker when applied to numeric data
>> expressed as text.
>> 
>> In any case, there's not really that much use for this generality of
>> numerals.  On the one hand, I think these days anyone who uses
>> information technology is fluent in ASCII numeration.  On the other,
>> if you want to allow people to write in other scripts, you probably
>> are dealing with "naive" users who should be allowed to use grouping
>> characters and the usual conventions for their locale, and int() and
>> float() just aren't good enough anyway.
> I was thinking that we could also add a function for numeric
> translation/transliteration somewhere:
> 
> >>> # Translate to Bengali
> >>> translate_number('0123456789', 'bengali')
> '০১২৩৪৫৬৭৮৯'
> >>> # Translate to Oriya
> >>> translate_number('0123456789', 'oriya')
> '୦୧୨୩୪୫୬୭୮୯'
> >>> # Defaults to translating to ASCII range
> >>> translate_number('୦୧୨୩୪୫୬୭୮୯')
> '0123456789'
> 
> Non-numeric strings and mixed scripts would raise an exception.

I like this, but I'm not sure how to completely specify it, or how to describe it.

What does translate_number('-1.2e+3', 'oriya') return? Or '2j'?

What about translate_number('二万三十') or '2万3十'? Or does it only handle Arabic-style place-value numerals?
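
To illustrate why that second question is not trivial: the standard unicodedata module already separates true decimal digits from characters that merely carry a numeric value. The snippet below is only an illustration; is_decimal_digit is a hypothetical helper name, not an existing function.

```python
import unicodedata

def is_decimal_digit(ch):
    """Hypothetical helper: True if ch is a place-value decimal digit."""
    try:
        unicodedata.decimal(ch)
    except ValueError:
        return False
    return True

# Oriya digit two (U+0B68) is a decimal digit, so int() and float() accept it.
print(is_decimal_digit('\u0b68'), unicodedata.category('\u0b68'))

# CJK numerals have a numeric value but are not decimal digits,
# so a purely digit-by-digit translation has nothing to map them to.
for ch in '二万三十':
    print(ch, unicodedata.category(ch),
          unicodedata.numeric(ch), is_decimal_digit(ch))
```

So a function restricted to place-value digits would have to reject '二万三十' outright, and '2万3十' would count as mixed input.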

If this is meant to solve the problem with "naive users", does it handle grouping characters as well, even though they can't be used with int or float? Or locale decimal points? Or parens for negatives?
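
To make these questions concrete, here is a minimal sketch of one possible reading of translate_number. It is not a proposed specification. The ZERO table and the PASS_THROUGH set are assumptions made up for this example: only decimal digits are translated; sign, decimal point, exponent marker and 'j' are copied unchanged; everything else (CJK numerals, grouping characters, parentheses) is rejected.

```python
import unicodedata

# Hypothetical table: script name -> code point of that script's digit zero.
ZERO = {
    'ascii': 0x0030,
    'arabic-indic': 0x0660,
    'bengali': 0x09E6,
    'oriya': 0x0B66,
    'thai': 0x0E50,
}

# Characters copied through untouched. This is an assumption, not a
# settled answer to what '-1.2e+3' or '2j' should become.
PASS_THROUGH = set('+-.eEjJ')

def translate_number(text, script='ascii'):
    """Sketch: map decimal digits in text to the digits of script."""
    target_zero = ZERO[script]
    source_zero = None
    out = []
    for ch in text:
        if ch in PASS_THROUGH:
            out.append(ch)
            continue
        try:
            value = unicodedata.decimal(ch)
        except ValueError:
            raise ValueError('not a decimal digit: %r' % ch)
        # Decimal digits of one script are contiguous, so the code point
        # of that script's zero identifies the script.
        ch_zero = ord(ch) - value
        if source_zero is None:
            source_zero = ch_zero
        elif ch_zero != source_zero:
            raise ValueError('mixed scripts in %r' % text)
        out.append(chr(target_zero + value))
    return ''.join(out)

print(translate_number('0123456789', 'bengali'))
print(translate_number('-1.2e+3', 'oriya'))
```

Under this reading, translate_number('-1.2e+3', 'oriya') keeps the ASCII sign, point and exponent and changes only the digits, while '٢๑', '二万三十' and '1,234' all raise ValueError. Whether that is what a "naive" user expects is exactly the open question.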
