[Tutor] Limitation of int() in converting strings

Sat Dec 22 18:38:53 CET 2012

On 22 December 2012 01:34, Steven D'Aprano <steve at pearwood.info> wrote:
> On 18/12/12 01:36, Oscar Benjamin wrote:
>
>> I think it's unfortunate that Python's int() function combines two
>> distinct behaviours in this way. In different situations int() is used
>> to:
>> 1) Coerce an object of some type other than int into an int without
>> changing the value of the integer that the object represents.
>
> The second half of the sentence (starting from "without changing") is not
> justified. You can't safely make that assumption. All you know is that
> calling int() on an object is intended to convert the object to an int,
> in whatever way is suitable for that object. In some cases, that will
> be numerically exact (e.g. int("1234") will give 1234), in other cases it
> will not be.

If I were to rewrite that sentence I would replace the word 'integer'
with 'number', but otherwise I'm happy with it. Your reference to
"numerically exact" shows that you understood exactly what I meant.

>> 2) Round an object with a non-integer value to an integer value.
>
>
> int() does not perform rounding (except in the most generic sense that any
> conversion from real-valued number to integer is "rounding"). That is what
> the round() function does. int() performs truncating: it returns the
> integer part of a numeric value, ignoring any fraction part:

I was surprised by your objection to my use of the word "rounding"
here. So I looked it up on Wikipedia:
http://en.wikipedia.org/wiki/Rounding#Rounding_to_integer

That section describes "round toward zero (or truncate..." which is
essentially how I would have put it, and also how you put it below:

>
> * truncate, or round towards zero (drop any fraction part);

So I'm not really sure what your objection is to that, though you are
free to prefer the word truncate to round in this case (and I am free
to disagree).

<snip>
> So you shouldn't think of int(number) as "convert number to an int", since
> that is ambiguous. There are at least six common ways to convert arbitrary
> numbers to ints:

This is precisely my point. I would prefer it if int(obj) failed
on non-integers, leaving me with the option of calling an appropriate
rounding function. After catching RoundError (or whatever) you would
know that you have a number type object that can be passed to round,
ceil, floor etc.
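
To make the idea concrete, here is a minimal sketch of what I have in
mind. Nothing in it exists in Python: RoundError, strict_int and to_int
are hypothetical names invented for this illustration.

```python
import math
from numbers import Number


class RoundError(ValueError):
    """Hypothetical exception: the value is a number but not integer-valued."""


def strict_int(obj):
    """Hypothetical int() that refuses to discard a fractional part."""
    if isinstance(obj, Number):
        result = int(obj)  # int() truncates towards zero
        if result != obj:
            raise RoundError("%r is not an integer value" % (obj,))
        return result
    # Non-numbers (e.g. strings) are parsed exactly as int() parses them.
    return int(obj)


def to_int(obj, rounder=math.floor):
    """Convert exactly if possible, otherwise apply an explicit rounder."""
    try:
        return strict_int(obj)
    except RoundError:
        # We now know obj is a number, so the caller's rounding choice applies.
        return int(rounder(obj))
```

With this, strict_int(3.0) and strict_int("42") succeed, strict_int(2.5)
raises RoundError, and to_int(-2.5) gives -3 because the caller (here by
default) asked for floor rather than getting truncation implicitly.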

> Python provides truncation via the int and math.trunc functions, floor and
> ceiling via math.floor and math.ceil, and round to nearest via round.
> In Python 2, ties are rounded up, which is biased; in Python 3, the
> unbiased banker's rounding is used.

I wasn't aware of this change. Thanks for that.
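
For reference, a small sketch that puts the conversions mentioned above
side by side. The helper name conversions is made up for this example,
and the behaviour described in the note below it assumes Python 3.

```python
import math


def conversions(x):
    """Return the result of each standard number-to-int conversion for x."""
    return {
        "int": int(x),            # truncate (round towards zero)
        "trunc": math.trunc(x),   # same as int() for floats
        "floor": math.floor(x),   # round towards negative infinity
        "ceil": math.ceil(x),     # round towards positive infinity
        "round": round(x),        # round to nearest; ties to even in Python 3
    }


for value in (-2.5, -1.5, 1.5, 2.5):
    print(value, conversions(value))
```

On Python 3 the "round" entry is 2 for both 1.5 and 2.5 (ties go to the
even neighbour), whereas int and trunc give 1 and 2. On Python 2,
round(2.5) returns 3.0 instead, and math.floor and math.ceil return
floats rather than ints.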

> Instead, you should consider int(number) to be one of a pair of functions,
> "return integer part", "return fraction part", where unfortunately the
> second function isn't provided directly. In general though, you can get
> the fractional part of a number with "x % 1". For floats, math.modf also
> works.

Assuming that you know you have an object that supports algebraic
operations in a sensible way then this works, although the
complementary function for "x % 1" would be "x // 1" or
"math.floor(x)" rather than "int(x)". To get the complementary
function for "int(x)" you could do "math.copysign(abs(x) % 1, x)"
(maybe there's a simpler way). The session below shows that
"int(x) + x % 1" does not reconstruct negative non-integers:

$ python
Python 2.7.3 (default, Sep 26 2012, 21:51:14)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> def reconstruct(x):
...     return int(x) + x % 1
...
>>> reconstruct(1)
1
>>> reconstruct(1.5)
1.5
>>> reconstruct(-2)
-2
>>> reconstruct(-2.5)
-1.5
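
For comparison, a sketch of the same check using a sign-preserving
fraction part. The helper names frac and reconstruct are just for
illustration, and the approach assumes float-like input, since
math.copysign converts its arguments to float.

```python
import math


def frac(x):
    """Fraction part carrying the sign of x, the complement of int(x)."""
    return math.copysign(abs(x) % 1, x)


def reconstruct(x):
    """Rebuild x from its truncated integer part and signed fraction part."""
    return int(x) + frac(x)


for value in (1, 1.5, -2, -2.5):
    print(value, reconstruct(value))
```

Here reconstruct(-2.5) gives -2.5 as desired, although the result is
always a float, even for int input. For floats, math.fmod(x, 1) and
math.modf(x)[0] also return a fraction part with the sign of x, so
either may be the simpler way.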

> So, in a sense int() does do double-duty as both a constructor of ints
> from non-numbers such as strings, and as a "get integer part" function for
> numbers. I'm okay with that.

And I am not.

Oscar