Hi,
I wrote a reference implementation for a C ufunc, `isint`, which returns True for integers and False for non-integers, found here: https://github.com/madphysicist/isint_ufunc. The idea came from a Stack Overflow question of mine, which has gotten a fair number of views and even some upvotes: https://stackoverflow.com/q/35042128/2988730. The current "recommended" solution is to use ``((x % 1) == 0)``. This is slower and more cumbersome because of the math operations and the temporary storage. My version returns a single array of booleans with no intermediaries, and is between 5 and 40 times faster, depending on the type and size of the input.
If you are interested in taking a look, there is a suite of tests and a small benchmarking script that compares the ufunc against the modulo expression. The implementation currently works by bit twiddling on an appropriately converted integer representation of the number. It assumes a standard IEEE 754 representation for float16, float32, and float64. The extended 80-bit float128 format gets some special treatment because of its explicit integer bit. Complex numbers currently count as integers only if they are real and integral. Integer types (including bool) are always integers. Time and text types raise TypeError, since integerness is meaningless for them.
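To make the bit-twiddling idea concrete, here is a minimal pure-Python sketch for a single float64 value. It is not the C code from the repository, and the name `is_integer_bits` is made up for illustration. It assumes the standard IEEE 754 binary64 layout (1 sign bit, 11 exponent bits, 52 stored fraction bits) and only shows the general shape of the test: look at the exponent, then check that every fraction bit with a weight below 1 is zero.

```python
import struct


def is_integer_bits(x):
    """Return True if the float64 value x is integral, using only its bit pattern.

    Hypothetical helper for illustration; assumes IEEE 754 binary64.
    """
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    exponent = (bits >> 52) & 0x7FF      # 11-bit biased exponent
    fraction = bits & ((1 << 52) - 1)    # 52 stored fraction bits

    if exponent == 0x7FF:
        # All-ones exponent: infinity or NaN, never an integer.
        return False
    if exponent == 0:
        # Zero or subnormal: only +0.0 and -0.0 are integers.
        return fraction == 0

    e = exponent - 1023                  # unbiased exponent
    if e < 0:
        # 0 < |x| < 1
        return False
    if e >= 52:
        # Every fraction bit has a weight of at least 1.
        return True
    # The low (52 - e) fraction bits have weights below 1 and must all be zero.
    return fraction & ((1 << (52 - e)) - 1) == 0
```

For finite inputs this should agree with both `x % 1 == 0` and the built-in `float.is_integer`. The other widths would need their own exponent and fraction sizes, and the 80-bit extended format would additionally have to account for its explicit integer bit.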
If a consensus forms that this is something appropriate for numpy, I will need some pointers on how to package up C code properly. This was an opportunity for me to learn to write a basic ufunc. I am still a bit confused about where code like this would go, and how to harness numpy's code generation. I put comments in my .c and .h files showing how I would expect the generators to look, but I'm not sure where to plug something like that into numpy. It would also be nice to test on architectures whose long double is something other than the 80-bit extended format, for example a proper quad-precision float128.
Please let me know your thoughts.
Regards,
- Joe
At least some of the commenters on that StackOverflow page need a slightly stronger check: not only is_integer(x), but also "np.iinfo(dtype).min <= x <= np.iinfo(dtype).max" for some particular dtype, i.e. "Can I losslessly set these values into the array I already have?"
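One way to read that stronger check is sketched below in plain Python for a single scalar. It is only an illustration under stated assumptions: the helper names `int_bounds` and `fits_losslessly` are hypothetical, and the bounds are computed from a bit width to mirror what `np.iinfo(dtype).min` and `np.iinfo(dtype).max` report for two's-complement integer types, rather than by calling NumPy.

```python
def int_bounds(bits, signed=True):
    """Smallest and largest value of a two's-complement integer type.

    Hypothetical stand-in for np.iinfo(dtype).min / np.iinfo(dtype).max.
    """
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def fits_losslessly(x, bits, signed=True):
    """True if x is integral and lies inside the range of the integer type."""
    lo, hi = int_bounds(bits, signed)
    # float.is_integer() is False for inf and NaN, so those never pass.
    # Python compares int and float exactly, so the bounds are not rounded.
    return float(x).is_integer() and lo <= x <= hi


print(fits_losslessly(127.0, 8))                  # inside the int8 range
print(fits_losslessly(128.0, 8))                  # one past the int8 maximum
print(fits_losslessly(255.0, 8, signed=False))    # inside the uint8 range
print(fits_losslessly(3.5, 32))                   # not integral at all
```

The exact comparison matters at the edges: `2.0 ** 63` is integral but is one larger than the int64 maximum, so the sketch rejects it.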
On Thu, Dec 30, 2021 at 4:34 PM Joseph Fox-Rabinovitz < jfoxrabinovitz@gmail.com> wrote:
<snip>
NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: jbrockmendel@gmail.com
Is adding arbitrary optional parameters a thing with ufuncs? I could easily add upper and lower bounds checks.
On Thu, Dec 30, 2021, 20:56 Brock Mendel jbrockmendel@gmail.com wrote:
<snip>
On Fri, Dec 31, 2021 at 1:36 AM Joseph Fox-Rabinovitz < jfoxrabinovitz@gmail.com> wrote:
Hi,
I wrote a reference implementation for a C ufunc, `isint`, which returns True for integers and False for non-integers, found here: https://github.com/madphysicist/isint_ufunc. <snip>
Shouldn't we keep the name of the stdlib float method?
>>> (3.0).is_integer()
True
See https://docs.python.org/3/library/stdtypes.html#float.is_integer
András
On Fri, Dec 31, 2021 at 5:46 AM Andras Deak deak.andris@gmail.com wrote:
Shouldn't we keep the name of the stdlib float method?
This sounds obvious in hindsight. I renamed the function to is_integer and renamed the repo to match. The new link is here: https://github.com/madphysicist/is_integer_ufunc