On 12/2/2010 4:48 PM, "Martin v. Löwis" wrote:
On 02.12.2010 22:30, Steven D'Aprano wrote:
Martin v. Löwis wrote:
Then these users should speak up and indicate their need, or somebody should speak up and confirm that there are users who actually want '١٢٣٤.٥٦' to denote 1234.56. To my knowledge, there is no writing system in which '١٢٣٤.٥٦e4' means 12345600.0.
I'm not sure what you're after here.
That the current float() constructor accepts tons of bogus character strings as numbers, and that it should stop doing so.
What bogus characters do the float() and int() constructors accept? As far as I can see, they only accept numerals.
Not bogus characters, but bogus character strings. E.g. strings that mix digits from different scripts, and mix them with the Python decimal separator.
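To make the point concrete, here is a minimal sketch of the behaviour under discussion, assuming CPython 3, where float() maps any Unicode decimal digit to its ASCII value before parsing. The variable names are illustrative only, and the digits are written as escapes to avoid bidirectional rendering problems.

```python
# Arabic-Indic digits (U+0660..U+0669) combined with an ASCII '.' and ASCII 'e'.
arabic_with_ascii_dot = "\u0661\u0662\u0663\u0664.\u0665\u0666"
value = float(arabic_with_ascii_dot)
value_with_exponent = float(arabic_with_ascii_dot + "e4")

# One string mixing ASCII, Arabic-Indic and Devanagari digits:
# '1', ARABIC-INDIC TWO, DEVANAGARI THREE, '4', '.', ARABIC-INDIC FIVE, '6'
mixed_scripts = "1\u0662\u09694.\u06656"
mixed_value = float(mixed_scripts)

print(value, value_with_exponent, mixed_value)
```

All three calls succeed, although none of the strings is something a reader of any single script would be expected to write.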
Notice that Python does *not* currently support printing numbers in other scripts - even though this may actually be more useful than parsing.
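As a rough sketch of what such output could look like, digits can be translated after ordinary formatting. The helper name format_arabic_indic is hypothetical, and the choice of U+066B (ARABIC DECIMAL SEPARATOR) as the separator is an assumption; real locale conventions differ and are not handled here.

```python
# Arabic-Indic digits U+0660..U+0669, in order 0..9.
ARABIC_INDIC_DIGITS = "".join(chr(0x0660 + i) for i in range(10))

# Map ASCII digits to Arabic-Indic digits and '.' to U+066B (assumed separator).
_TABLE = str.maketrans("0123456789.", ARABIC_INDIC_DIGITS + "\u066b")


def format_arabic_indic(value: float, places: int = 2) -> str:
    """Format value with a fixed number of decimals, then swap the digits.

    Signs and any other characters are left as ASCII.
    """
    return f"{value:.{places}f}".translate(_TABLE)


print(format_arabic_indic(1234.56))
```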
Lack of one function, even if more useful, does not imply that an existing function should be removed.
No. But if the specific function(ality) is not useful and underspecified, it should be removed.
So your problems with the current behaviour are:
(1) in some unspecified way, it's not done correctly;
No. My main concern is that it is not properly specified. If it were specified, I could then tell you what precisely is wrong about it. Right now, I can only give examples of input that it should not accept, and examples of input that it should accept but does not.
(2) it belongs somewhere other than float() and int().
That's only because it also needs a parameter to specify what syntax to follow, somehow. That parameter could be explicit or implicit, and it could be passed to float() or to some other function. But it must be available, and is not.
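One way to picture the explicit-parameter variant is the sketch below. Everything in it is an assumption made for illustration: the function name parse_decimal, the SCRIPTS table, and the pairing of Arabic-Indic digits with U+066B are not an existing Python API. Signs, exponents and grouping separators are deliberately left out.

```python
# Hypothetical table: script name -> (code point of digit zero, decimal separator).
SCRIPTS = {
    "ascii": (0x0030, "."),
    "arabic-indic": (0x0660, "\u066b"),  # U+066B ARABIC DECIMAL SEPARATOR
}


def parse_decimal(text: str, script: str) -> float:
    """Parse text using only the digits and separator of the named script."""
    zero, separator = SCRIPTS[script]
    digits = {chr(zero + i): str(i) for i in range(10)}
    ascii_chars = []
    seen_separator = False
    for ch in text:
        if ch in digits:
            ascii_chars.append(digits[ch])
        elif ch == separator and not seen_separator:
            ascii_chars.append(".")
            seen_separator = True
        else:
            # Digits from another script, a foreign separator, or junk.
            raise ValueError(f"{ch!r} is not valid in script {script!r}")
    # float() still rejects leftovers such as "" or "." on its own.
    return float("".join(ascii_chars))


print(parse_decimal("\u0661\u0662\u0663\u0664\u066b\u0665\u0666", "arabic-indic"))
```

With the syntax named up front, a string that mixes scripts, or pairs Arabic-Indic digits with an ASCII dot, raises ValueError instead of being guessed at.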
That second is awfully close to bike-shedding. Since you accept that Python *should* have the current behaviour
No, I don't. I think it behaves incorrectly, accepting garbage input and guessing some meaning out of it.
- how the current behaviour is incorrect;
See above: it accepts strings that do not denote real numbers in any writing system, and, despite the claim that the feature is there to support other writing systems, actually does not truly support other writing systems.
- your suggestions for correcting it; and
Make the current implementation exactly match the current documentation. I think the documentation is correct; the implementation is wrong.
- a concrete suggestion for where you would like to see the behaviour
moved to, and why that would be better than where it currently is.
The current behavior should go nowhere; it is not useful. Something very similar to the current behavior (but done correctly) should go into the locale module.
I agree with everything Martin says here. I think the basic premise is: you won't find strings "in the wild" that use non-ASCII digits but do use the ASCII dot as a decimal point. And that's what float() is looking for. (And that doesn't even begin to address what it expects for an exponent 'e'.)
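A short check of that premise, again assuming CPython 3: the hybrid form with an ASCII dot is accepted, while a form using U+066B (ARABIC DECIMAL SEPARATOR) is rejected. Whether the second string is what a given locale would actually write is an assumption here, not something established in this thread.

```python
def float_accepts(text: str) -> bool:
    """Return True if the built-in float() parses text without error."""
    try:
        float(text)
    except ValueError:
        return False
    return True


# Arabic-Indic digits with an ASCII '.' as the decimal point.
hybrid = "\u0661\u0662\u0663\u0664.\u0665\u0666"
# Same digits with U+066B ARABIC DECIMAL SEPARATOR instead.
with_arabic_separator = "\u0661\u0662\u0663\u0664\u066b\u0665\u0666"

print(float_accepts(hybrid), float_accepts(with_arabic_separator))
```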