[Python-ideas] [Python-Dev] Unicode minus sign in numeric conversions

Terry Jan Reedy tjreedy at udel.edu
Sun Jun 9 08:21:25 CEST 2013


On 6/9/2013 1:30 AM, Guido van Rossum wrote:
> I'm beginning to feel that it was even a mistake to accept all those
> other Unicode decimal digits, because it leads to the mistaken belief
> that one can parse a number without knowing the locale. Python's
> position about this is very different from what the heuristics you
> quote seem to suggest: "refuse the temptation to guess" leads to our
> current, simple rule which only accepts '.' as the decimal indicator,
> and leaves localized parsing strictly to the application (or to some
> other library module).

The referenced algorithm is about extracting number literals, especially 
integers, out of text. Doing that 'properly' depends on locale and 
purpose. int() is about converting a literal to its numeric value, 
whether written directly or extracted. Once a decision is made to 
convert a literal, the conversion itself (for integers) is not locale 
dependent.
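
To illustrate that locale-independence (an interactive-session sketch; 
the particular non-ASCII digits are arbitrary examples):

    >>> int('42')                     # ASCII digits
    42
    >>> int('\u0664\u0662')           # ARABIC-INDIC digits four, two
    42
    >>> int('\uff14\uff12')           # fullwidth digits four, two
    42
    >>> float('3.14')                 # '.' is the only decimal indicator
    3.14
    >>> float('3,14')                 # a localized comma is left to the app
    Traceback (most recent call last):
      ...
    ValueError: could not convert string to float: '3,14'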

> Still, I suppose I could defend the current behavior from the
> perspective of writing such a localized parser -- at some point you've
> got a digit and you need to know its numeric value, and it is
> convenient that int(c) does so.

Exactly. I believe that is why it was extended to do that.
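
For instance (a session sketch), int() on a one-character string yields 
the digit's numeric value, matching what the Unicode database reports:

    >>> int('\u0663')                 # ARABIC-INDIC DIGIT THREE
    3
    >>> import unicodedata
    >>> unicodedata.digit('\u0663')
    3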

tjr



