[Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float

Tim Hochberg tim.hochberg at cox.net
Mon Apr 10 09:13:03 EDT 2006


Charles R Harris wrote:

> Tim,
>
> On 4/9/06, Tim Hochberg <tim.hochberg at cox.net> wrote:
>
>     Let me just add that, since this seems to cause confusion, it would be
>     appropriate to amend the docstring to be explicit that this always
>     returns an integral floating point value. If someone wants to suggest
>     wording, I can figure out where to put it. One possibility is:
>
>         "y = floor(x) elementwise largest integer <= x; note that the
>         result is a floating point value"
>
>     or
>
>         "y = floor(x) elementwise largest integral float <= x"
>
>
> How about, "for each item in x returns the largest integral float <= 
> item."

That seems pretty good. I'll wait a day or so and see what else shows up.
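
For reference, here is a minimal sketch of the behaviour the docstring is meant to describe, assuming a reasonably recent NumPy. The sample values and the names x, y, idx and data are made up for illustration.

```python
import numpy as np

# Hypothetical sample data, chosen only to illustrate the point.
x = np.array([0.2, 2.7, 5.0])

# floor() yields integral values, but the dtype is still floating point.
y = np.floor(x)

# An explicit conversion is needed before the result can serve as an index.
idx = y.astype(np.intp)

data = np.arange(10, 20)
picked = data[idx]

print(y.dtype, idx.dtype, picked)
```

The explicit astype() call is the "conversion to a real integer" step; which integer dtype to convert to is a design choice, and np.intp is used here only because it is the natural index type.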

>
> Chuck
>
> P.S.
>
> I too once found the C definition of the floor function annoying, but 
> I got used to it. Sorta like getting used to a broken leg. The main 
> problem is that the result can't be used as an index without 
> conversion to a "real" integer. Integers aren't members of the reals 
> (or rationals): apart from +/- 1, integers don't have inverses.
>
> There happens to be an injective ring homomorphism of the integers 
> into the reals, but that is not the same thing.

I'm not conversant with the terminology [here I rummage through Google 
to try to get the terminology sort of right], but as I understand it 
integers (I) are a subset of reals (R). The ring that you construct with 
integers consists of the set of integers plus the operations of 
addition/subtraction and multiplication as well as an identity. I've 
seen that specified as something like (I, +/-, *, 0). Similarly, the 
set of reals (R) and the field that one constructs from them are not 
really the same thing. So while the ring of integers is not a subset of 
the field of reals (the statement doesn't even make sense when put that 
way), the set of integers is a subset of the set of reals. I think that 
most people, outside of computer programmers and perhaps math majors, 
think of the set of integers, not the ring of integers, to the extent 
that they think about integers and reals at all. I imagine most people 
would conjure up some Dali-like image when confronted with the notion of 
a ring of integers!

(C-int, +/-, *, 0) actually forms a finite ring (in practice, arithmetic 
modulo a power of two), which is not at all the same thing as the ring 
of integers. Bit twiddlers tend to understand and even exploit this, but 
a lot of people conflate the ring of ints with the ring of integers. 
This works fine as long as your values are small in magnitude, but 
eventually it will rise up and bite you. Floats are even worse, since 
they don't even form a field. I think they're actually a semiring 
because of INF/NAN/IND, but I'm not certain about that. Issues with 
floating point pop up everywhere and, if you squint the right way, you 
can blame them on their lack of fieldness, which is closely tied to 
their finite range and precision, which is what bites people.
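
One way to see how fixed-width ints and floats fall short of the familiar algebraic structures is the sketch below. It uses an 8-bit modulus as a small stand-in for a C unsigned type (an assumption made so that a brute-force search is cheap), and the helper has_inverse is hypothetical, written only for this illustration.

```python
import math

# Small stand-in for a C unsigned type: arithmetic modulo 2**8.
MOD = 2 ** 8


def has_inverse(a, mod=MOD):
    """Return True if some b in range(mod) satisfies a * b == 1 (mod mod)."""
    return any((a * b) % mod == 1 for b in range(mod))


# Odd values have multiplicative inverses modulo a power of two ...
odd_invertible = has_inverse(3)
# ... but even values do not, so this structure is a ring, not a field.
even_invertible = has_inverse(2)

# Floating point addition is not associative.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

# inf - inf has no sensible value and yields NaN.
is_nan = math.isnan(float("inf") - float("inf"))

print(odd_invertible, even_invertible, left == right, is_nan)
```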

Because Python automatically promotes (Python) ints to (Python) longs, 
Python ints map, for most purposes, onto the ring of integers. However, 
in numpy we're stuck using C-ints for performance reasons, so we'd be 
wise to keep the differences between ints and integers in the back of 
our minds.
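
The difference between Python ints and C-ints can be sketched with the standard library alone. This assumes Python 3, where int and long are a single type, and it uses ctypes.c_int32 as a stand-in for a 32-bit C int. The names INT32_MAX, python_sum and c_sum are made up for the example.

```python
import ctypes

INT32_MAX = 2 ** 31 - 1

# Python ints grow as needed, so this behaves like the mathematical integers.
python_sum = INT32_MAX + 1

# ctypes does no overflow checking: the value is reduced to 32 bits,
# which is how a typical C int behaves when it wraps around.
c_sum = ctypes.c_int32(INT32_MAX + 1).value

print(python_sum, c_sum)
```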

This is wandering rather far afield (although it's entertaining).

> On the other hand, ints are generally not big enough to hold all of 
> the integral doubles, so as a practical matter the originators made 
> the best choice. Things do get a bit weird for large floats because 
> above a certain threshold floats are already integral values.
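
Both halves of that remark can be checked directly. The sketch below assumes IEEE 754 doubles, which is what Python floats are on essentially every platform, and the variable names are made up for illustration.

```python
INT64_MAX = 2 ** 63 - 1

# A large double is an integral value, yet far too big for a 64-bit int.
big = 1e300
big_is_integral = big.is_integer()
big_fits_in_int64 = big <= INT64_MAX

# Just below 2**52, doubles can still carry a fractional part ...
below_threshold = 2.0 ** 52 - 0.5
below_is_integral = below_threshold.is_integer()

# ... but above 2**53 not even every integer is representable:
# adding 1 is lost to rounding.
one_is_lost = (2.0 ** 53 + 1.0) == 2.0 ** 53

print(big_is_integral, big_fits_in_int64, below_is_integral, one_is_lost)
```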

Another issue at the moment is that integer division does an implicit 
flooring or truncation (I believe it's implementation dependent in C) in 
both C and Python, so if you aren't using floor to produce an index, 
something I've been known to do, having it return an integer could also 
lead to nasty surprises. For example:

from numpy import floor

def half_integer(x):
    """Return the nearest half integer at or below x."""
    return floor(2*x) / 2

This would start failing mysteriously if floor returned an int. Of course the above is an overflow 
magnet, so perhaps it's not the best example. Eventually, '/' is going 
to mean true_division and '//' will mean floor_division, so this 
particular issue will go away.
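
A small sketch of the pitfall, assuming Python 3, where '/' already means true division and '//' means floor division. Here math.floor, which returns an int in Python 3, stands in for a hypothetical integer-returning floor. The two function names are made up for the comparison.

```python
import math


def half_integer_true_div(x):
    """Nearest half integer at or below x, using true division."""
    return math.floor(2 * x) / 2


def half_integer_floor_div(x):
    """Same formula with integer (floor) division: the half is lost."""
    return math.floor(2 * x) // 2


good = half_integer_true_div(1.7)
bad = half_integer_floor_div(1.7)

# Python's '//' floors, whereas truncation rounds toward zero.
floored = -7 // 2
truncated = math.trunc(-7 / 2)

print(good, bad, floored, truncated)
```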

Regards,

-tim


