Re: Proposal for new function to determine if a float contains an integer

I would rather suggest an .is_integer(integer_dtype) signature, because knowing that 1e300 is an integer is not very useful in the numpy world, since this integer number is not representable as a numpy.integer dtype.
Note that in python
assert not f.is_integer() or int(f) == f
never fails because integers have unlimited precision, but this would not map into
assert ( ~f_arr.is_integer() | (np.int64(f_arr) == f_arr) ).all()
because of possible OverflowErrors.
Stefano
On 31 Dec 2021, at 04:46, numpy-discussion-request@python.org wrote:
Is adding arbitrary optional parameters a thing with ufuncs? I could easily add upper and lower bounds checks.
On Thu, Dec 30, 2021, 20:56 Brock Mendel <jbrockmendel@gmail.com> wrote:
At least some of the commenters on that StackOverflow page need a slightly stronger check: not only is_integer(x), but also "np.iinfo(dtype).min <= x <= np.iinfo(dtype).max" for some particular dtype, i.e. "Can I losslessly set these values into the array I already have?"
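Stefano's point can be illustrated with a small sketch. It uses only plain Python floats and `np.iinfo`; the variable names are illustrative and not part of any proposal in this thread. It shows that the Python-level invariant holds for 1e300 even though the value lies far outside what int64 can hold.

```python
import numpy as np

f = 1e300

# In pure Python the invariant always holds: int(f) is an arbitrary-precision
# integer, so the round trip is exact whenever f.is_integer() is true.
assert not f.is_integer() or int(f) == f

# The same value is far outside the range of a fixed-width NumPy integer.
# np.iinfo(...).max is a Python int, so this comparison is exact.
int64_max = np.iinfo(np.int64).max
fits_in_int64 = -int64_max - 1 <= int(f) <= int64_max

print(f.is_integer(), fits_in_int64)
```

The comparison is done on Python integers on purpose, so that no float rounding of the int64 bounds can affect the result.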

Stefano,
That is an excellent point. Just to make sure I understand, would an interface like `is_integer(a, int_dtype=None)` be satisfactory? That way, there are no bounds by default (call it python integer bounds), but the user can specify a limited type at will. An alternative would be something like `is_integer(a, bits=None, unsigned=False)`. This would have the advantage of testing against hypothetical types, which might be useful sometimes, or just annoying. I could always allow a two-element tuple in as an argument to the first version.
While I completely agree with the idea behind adding this test, one big question remains: can I add arbitrary arguments to a ufunc?
- Joe
On Sat, Jan 1, 2022 at 5:41 AM Stefano Miccoli <stefano.miccoli@polimi.it> wrote:
I would rather suggest an .is_integer(integer_dtype) signature, because knowing that 1e300 is an integer is not very useful in the numpy world, since this integer number is not representable as a numpy.integer dtype.
Note that in python
assert not f.is_integer() or int(f) == f
never fails because integers have unlimited precision, but this would not map into
assert ( ~f_arr.is_integer() | (np.int64(f_arr) == f_arr) ).all()
because of possible OverflowErrors.
Stefano
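For concreteness, the `is_integer(a, int_dtype=None)` interface floated in Joe's reply might behave roughly like the pure-NumPy sketch below. This is only an illustration under stated assumptions: the function is hypothetical, it is an ordinary Python function rather than a ufunc, and it assumes float64 input. One detail worth noting is that `np.iinfo(np.int64).max` (2**63 - 1) is not exactly representable as a float64, so the sketch compares against the exclusive bound `max + 1` instead.

```python
import numpy as np


def is_integer(a, int_dtype=None):
    """Hypothetical sketch: True where a float holds an integer value.

    If int_dtype is given, the value must also fit in that integer dtype.
    """
    a = np.asarray(a, dtype=np.float64)
    # NaN and +/-inf are never integers; finite values must equal their floor.
    result = np.isfinite(a) & (a == np.floor(a))
    if int_dtype is not None:
        info = np.iinfo(int_dtype)
        # info.min and info.max + 1 are zero or (negated) powers of two, so
        # both convert to float64 exactly; info.max itself may round up.
        lower = float(info.min)
        upper = float(info.max + 1)
        result &= (a >= lower) & (a < upper)
    return result


values = [1.0, 1.5, 1e300, 2.0**63, np.nan]
print(is_integer(values))            # no bounds: Python-like behaviour
print(is_integer(values, np.int64))  # additionally require an int64 fit
```

With no dtype the check matches Python's unbounded notion of an integer; with a dtype it answers the "can I losslessly set these values into the array I already have?" question.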
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: jfoxrabinovitz@gmail.com

Is there a guide on how to package non-ufunc functions with multiple loops? Something like sort? It looks like there is no way of adding additional arguments to a ufunc as of yet.
On a related note, would it be more useful to have a function that returns the number of bits required to store a number, or -1 if it has a fractional part? Then you could just test something like ``((k := integer_bits(a)) < 64) & (k > 0)``.
- Joe
On Sat, Jan 1, 2022 at 5:55 AM Joseph Fox-Rabinovitz <jfoxrabinovitz@gmail.com> wrote:
Stefano,
That is an excellent point. Just to make sure I understand, would an interface like `is_integer(a, int_dtype=None)` be satisfactory? That way, there are no bounds by default (call it python integer bounds), but the user can specify a limited type at will. An alternative would be something like `is_integer(a, bits=None, unsigned=False)`. This would have the advantage of testing against hypothetical types, which might be useful sometimes, or just annoying. I could always allow a two-element tuple in as an argument to the first version.
While I completely agree with the idea behind adding this test, one big question remains: can I add arbitrary arguments to a ufunc?
- Joe
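The `integer_bits` idea could be prototyped along the following lines. This is a rough sketch, not an existing NumPy function: it assumes float64 input, defines "bits" as the bit length of the absolute value (as `int.bit_length()` does), and returns -1 for anything that is not a finite integer. It relies on `np.frexp`, which writes x as m * 2**e with 0.5 <= |m| < 1, so e is the bit length of |x| for a nonzero integer-valued x.

```python
import numpy as np


def integer_bits(a):
    """Hypothetical sketch: bit length of |a|, or -1 where a is not an integer."""
    a = np.asarray(a, dtype=np.float64)
    is_int = np.isfinite(a) & (a == np.floor(a))
    # Replace non-integers by 0.0 so frexp only ever sees finite input;
    # frexp(0.0) yields an exponent of 0, i.e. zero needs 0 bits.
    _, exponent = np.frexp(np.where(is_int, a, 0.0))
    return np.where(is_int, exponent, -1)


a = np.array([0.0, 1.0, 255.0, 256.0, 1.5, 2.0**63, -4.0, np.nan])
k = integer_bits(a)
# Each comparison is parenthesized because & binds more tightly than <=.
fits_int64 = (k >= 0) & (k <= 63)
print(k)
print(fits_int64)
```

Under this definition the test `k <= 63` is slightly conservative for int64, since -2**63 has a bit length of 64 but is still representable.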

Have any of the numpy devs weighed in on this? If an efficient version of this were available in numpy there is a lot of pandas code I would enjoy ripping out.
On Sun, Jan 2, 2022 at 11:16 AM Joseph Fox-Rabinovitz <jfoxrabinovitz@gmail.com> wrote:
Is there a guide on how to package non-ufunc functions with multiple loops? Something like sort? It looks like there is no way of adding additional arguments to a ufunc as of yet.
On a related note, would it be more useful to have a function that returns the number of bits required to store a number, or -1 if it has a fractional part? Then you could just test something like ``((k := integer_bits(a)) < 64) & (k > 0)``.
- Joe

On Sun, 2022-01-23 at 14:12 -0800, Brock Mendel wrote:
Have any of the numpy devs weighed in on this? If an efficient version of this were available in numpy there is a lot of pandas code I would enjoy ripping out.
As this is a Python float method, that is an argument for implementing it without much thought, at least so long as it mostly aligns with Python.
If we are worried about arbitrary precision integers, maybe we should go with "index" and limit ourselves to "indexable integers"? (For such a method it may make sense if 2**60 would be rejected, as it is within the range where double spacing is larger than 1, but I am not sure what speed implications that would have.)
Generally, a `dtype=...` argument seems a bit much of a zoo of possibilities to me, because `int64.is_integer(np.uint8)` would suddenly make a lot of sense.
To answer the question about ufuncs: No, you can't use ufuncs without first inventing and implementing new API that can deal with this reasonably (unless you make a family of ufuncs for each `dtype=...`). That part is easier now than it once was, but that doesn't make it easy...
A try/except approach is maybe a bit more "accessible" in this regard, since you could make a ufunc:
np.try_cast_safely(arr, dtype=np.uint8)
that assumes things will work out but raises an error if they don't.
A probably weird possibility to do "int" specific, would be:
# cast to intp array, but do it "differently":
arr.astype(Indexable, copy=False)
where Indexable is a special DType. That requires only a very minor API extension. I don't mind that extension, but it is probably a bit too strange that the `Indexable` indicates a "fail on loss of precision" mode.
Cheers,
Sebastian
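The remark about 2**60 lying in the range where the spacing between doubles exceeds 1 can be checked with `np.spacing`. The snippet below is only a quick illustration; the choice of exponents is arbitrary. Above 2**53 adjacent float64 values are more than 1 apart, so adding 1.0 no longer changes the value.

```python
import numpy as np

# Gap to the next representable float64, and whether adding 1.0 is lost.
for exponent in (52, 53, 60):
    x = 2.0 ** exponent
    print(exponent, np.spacing(x), x + 1.0 == x)
```

An "indexable integers" rule that rejects such values would therefore have to cut off at or below 2**53 in magnitude, which is an interpretation of the suggestion rather than something the thread settles.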
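The try/except style mentioned in Sebastian's reply can be approximated today in pure NumPy. The sketch below is an assumption-laden stand-in: `try_cast_safely` is not a real NumPy function or ufunc, the version here is an ordinary Python helper for float64 input and integer target dtypes, and the choice of `ValueError` is arbitrary.

```python
import numpy as np


def try_cast_safely(arr, dtype):
    """Hypothetical helper: cast floats to an integer dtype, or raise."""
    arr = np.asarray(arr, dtype=np.float64)
    info = np.iinfo(dtype)
    # Lossless means: finite, no fractional part, and inside the dtype range.
    # info.max + 1 is used as an exclusive bound because it converts to
    # float64 exactly, unlike info.max for 64-bit types.
    lossless = (
        np.isfinite(arr)
        & (arr == np.floor(arr))
        & (arr >= float(info.min))
        & (arr < float(info.max + 1))
    )
    if not lossless.all():
        raise ValueError(
            f"values cannot be cast to {np.dtype(dtype).name} without loss"
        )
    return arr.astype(dtype)


try:
    small = try_cast_safely([1.0, 300.0], np.uint8)
except ValueError as exc:
    print("falling back to float storage:", exc)
```

The caller assumes the cast will work and handles the exceptional case separately, which avoids threading a `dtype=...` argument through an elementwise predicate.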
participants (4)
- Brock Mendel
- Joseph Fox-Rabinovitz
- Sebastian Berg
- Stefano Miccoli