[Tutor] why does this raise an exception...

Fri May 14 06:12:37 EDT 2004

At 16:55 2004-05-13 +0300, roman wrote:
>When converting a few string values to integers with int() I found out
>that trying to convert "2.1", i.e. a string that represents a floating
>point value to an integer raises an exception. Why does this raise an
>exception while something like float("9") does not?

This is just the way it's described in the manual. I guess the real answer
is that Guido felt that this was the most reasonable behaviour. If you
think more about it, I suspect you will agree.

It might seem strange that "int(2.7)" will truncate the float to an
integer with value 2, and "int('2')" will convert the string "2" to
the integer 2, but you can't do both things at once, i.e. get the
string "2.7" converted to the integer 2 in one step.
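The three cases are easy to check for yourself. Here is a small sketch in Python 3 syntax; the helper name try_int is just something made up for this illustration:

```python
def try_int(value):
    """Return int(value), or the error text if the conversion is refused."""
    try:
        return int(value)
    except ValueError as exc:
        return "ValueError: %s" % exc

truncated = try_int(2.7)    # float -> int, the fraction is cut off
parsed = try_int("2")       # string of digits -> int
rejected = try_int("2.7")   # float-looking string -> ValueError

print(truncated)
print(parsed)
print(rejected)
```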

But when you look at it like this, you realize that it means that you
would either first have to convert the string "2.7" to a float, and
then truncate and convert that to an integer, or you would have to cut
the string after the first uninterrupted sequence of digits (or maybe
interpret the string in some more advanced way).

It's one of Python's mottos that "Explicit is better than implicit",
so Pythonistas generally think that if "int(float(aString))" is what
you want, you should spell that out, not let the Python interpreter
guess whether "int(aString)" with "aString='3.7'" implies
"int(float(aString))" or whether the input value was actually outside
the intended range.

And if we allow "int('2.7')" I suppose we should also allow something
like "int('2e30')". "int(2e30)" works, but if you think it will return
2000000000000000000000000000000 you are wrong. It yields
2000000000000000039769249677312. This is because floating point numbers
like 2e30 are approximations. There is a limited precision in floats,
and people who work with floats should be aware of that. Ints and longs
are exact.
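To see the difference between the float route and plain integer arithmetic, something like this should do (a minimal sketch; the variable names are mine):

```python
approx = int(2e30)    # 2e30 is a float literal, so this goes via a float
exact = 2 * 10**30    # integers only, no float involved

print(approx)
print(exact)
print(approx - exact)  # the error introduced by the float approximation
```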

So, if you do "int(float('2e30'))" you *should* expect to get
2000000000000000039769249677312. But what should you expect to get from
"int('2e30')"? Some might argue that it should be the same as
"int(float('2e30'))", but it also seems reasonable to say that
"int('2e30')" should be the same as "int(2000000000000000000000000000000)".
I think most Python programmers would expect that. After all, 2e30 is
mathematically an integer, even if the literal "2e30" is interpreted
as a float in Python, for instance in "big = 2e30". And what about
"int('2.5e1')"? That should yield 25 just like "int(2.5e1)", right?

It's very useful that the int() constructor can convert floats to
ints, and the decision has been made to truncate the fractional part
towards zero. Fine, this is something you often do, and if you want to round
to nearest, or towards plus or minus infinity, you can use int(round(x)),
int(math.ceil(x)) or int(math.floor(x)). So, int complements the other
existing functions. There is no ambiguity here.
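Side by side, the four variants look like this for a positive and a negative value. This is only a quick sketch with made-up names; note that in Python 3, round() sends exact halves such as 2.5 to the nearest even integer, which doesn't matter for the values used here:

```python
import math

x, y = 2.7, -2.7

truncated = (int(x), int(y))                      # towards zero
nearest = (int(round(x)), int(round(y)))          # to nearest
up = (int(math.ceil(x)), int(math.ceil(y)))       # towards plus infinity
down = (int(math.floor(x)), int(math.floor(y)))   # towards minus infinity

print(truncated, nearest, up, down)
```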

For strings, the decision has been made to only allow a sequence of at
least one digit, optionally preceded by a plus or minus sign and optionally
surrounded by leading and trailing whitespace. All other strings are
rejected. I think this is a good choice, since allowing more than that
would open the door to various interpretations of what value to expect
from a certain string.
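A quick way to probe which strings are accepted is to try a handful of them. The helper accepts() and the sample list below are my own, chosen only to illustrate the rule as described:

```python
def accepts(s):
    """True if int(s) succeeds, False if it raises ValueError."""
    try:
        int(s)
        return True
    except ValueError:
        return False

samples = ["42", "+42", "-42", "  42\n", "2.7", "2e30", "4 2", ""]
results = {s: accepts(s) for s in samples}

for s in samples:
    print(repr(s), results[s])
```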

It's also a practical choice. It's much easier to write the C code to
handle strings like the ones int() accepts than to write a parser
that does the right thing with input like -12342.345435345e-2 without
going via floats that lose the integer precision for large integers.
It will also run faster. Even if performance isn't considered the most
important aspect of Python, I doubt that programmers in general would
like int() to be slowed down to make int("2.5") work.

If you want "int(float(x))" and think it's too much typing, you can write
a function for it:

def flint(x): return int(float(x))

If you want things like int("2.56e30") to give an exact result (which flint
won't), you have to write some more code. This is an unusual use case, and
fits better in a custom module than in the Python core.
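One possible way to get the exact result, assuming a Python version that has the standard decimal module, is to parse the string as a Decimal instead of a float. The name exact_int is hypothetical, and this is a sketch rather than a complete parser:

```python
from decimal import Decimal

def flint(x):
    return int(float(x))

def exact_int(s):
    """Parse a decimal or scientific-notation string without using floats.

    Truncates towards zero, like int() does for floats. Malformed input
    raises decimal.InvalidOperation rather than ValueError.
    """
    return int(Decimal(s))

print(flint("2.56e30"))      # goes via a float, so it is only approximate
print(exact_int("2.56e30"))  # exact
print(exact_int("-12342.345435345e-2"))
```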

I for one am more interested in converting local conventions for numeric
strings such as "1.000.000,00" or "1,000,000.00" or "1 000 000,00" than
scientific notations. Sometimes I want to convert unit prefixes such as
k, M, G etc as well. Neither of these things belongs in the core of the
language.
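Just to show the kind of custom code I mean, here is a rough sketch. Everything in it (the function name parse_number, its parameters and the PREFIXES table) is invented for this example, and it makes no attempt to validate where the separators sit:

```python
from decimal import Decimal

# Hypothetical table of unit prefixes; extend as needed.
PREFIXES = {"k": 10**3, "M": 10**6, "G": 10**9}

def parse_number(text, thousands=",", decimal_point="."):
    """Convert a locally formatted numeric string to a Decimal.

    The caller states which characters are used as the thousands
    separator and the decimal point. A trailing k, M or G is
    treated as a multiplier.
    """
    text = text.strip()
    factor = 1
    if text and text[-1] in PREFIXES:
        factor = PREFIXES[text[-1]]
        text = text[:-1].strip()
    # Drop the grouping character first, then normalise the decimal point.
    text = text.replace(thousands, "").replace(decimal_point, ".")
    return Decimal(text) * factor

print(parse_number("1.000.000,00", thousands=".", decimal_point=","))
print(parse_number("1,000,000.00"))
print(parse_number("1 000 000,00", thousands=" ", decimal_point=","))
print(parse_number("2.5k"))
```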

See also http://diveintopython.org/unit_testing/stage_1.html for a more
exotic way of converting strings to numbers and vice versa. A special
case for someone with the name Roman...

--
Magnus Lycka (It's really Lyckå), magnus at thinkware.se
Thinkware AB, Sweden, www.thinkware.se
I code Python ~ The Agile Programming Language