Nice float -> integer conversion?

Hi,
Have I missed a fast way of doing nice float to integer conversion?
By nice I mean rounding to the nearest integer, converting NaN to 0, and mapping inf and -inf to the max and min of the integer range. The astype method and cast functions don't do what I need here:
In [40]: np.array([1.6, np.nan, np.inf, -np.inf]).astype(np.int16)
Out[40]: array([1, 0, 0, 0], dtype=int16)
In [41]: np.cast[np.int16](np.array([1.6, np.nan, np.inf, -np.inf]))
Out[41]: array([1, 0, 0, 0], dtype=int16)
Have I missed something obvious?
See y'all,
Matthew

On 11 Oct 2011, at 20:06, Matthew Brett wrote:
Have I missed a fast way of doing nice float to integer conversion?
np.[a]round comes closer to what you wish (is there consensus that NaN should map to 0?), but not quite there, and it's not really consistent either!
In [42]: c = np.zeros(4, np.int16)
In [43]: d = np.zeros(4, np.int32)
In [44]: np.around([1.6,np.nan,np.inf,-np.inf], out=c)
Out[44]: array([2, 0, 0, 0], dtype=int16)
In [45]: np.around([1.6,np.nan,np.inf,-np.inf], out=d)
Out[45]: array([ 2, -2147483648, -2147483648, -2147483648], dtype=int32)
Perhaps a starting point to harmonise this behaviour and get it closer to your expectations (it still would not be really nice having to define the output array first, I guess)...
Cheers, Derek

On Tue, Oct 11, 2011 at 3:06 PM, Derek Homeier derek@astro.physik.uni-goettingen.de wrote:
In [42]: c = np.zeros(4, np.int16)
In [43]: d = np.zeros(4, np.int32)
In [44]: np.around([1.6,np.nan,np.inf,-np.inf], out=c)
Out[44]: array([2, 0, 0, 0], dtype=int16)
In [45]: np.around([1.6,np.nan,np.inf,-np.inf], out=d)
Out[45]: array([ 2, -2147483648, -2147483648, -2147483648], dtype=int32)
Perhaps a starting point to harmonise this behaviour and get it closer to your expectations (it still would not be really nice having to define the output array first, I guess)...
what numpy is this?
>>> np.array([1.6, np.nan, np.inf, -np.inf]).astype(np.int16)
array([ 1, -32768, -32768, -32768], dtype=int16)
>>> np.__version__
'1.5.1'
>>> a = np.ones(4, np.int16)
>>> a[:] = np.array([1.6, np.nan, np.inf, -np.inf])
>>> a
array([ 1, -32768, -32768, -32768], dtype=int16)
I thought we got a ValueError here, to avoid NaN-to-zero bugs:
>>> a[2] = np.nan
Traceback (most recent call last):
  File "<pyshell#22>", line 1, in <module>
    a[2] = np.nan
ValueError: cannot convert float NaN to integer
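For anyone reading along later, here is a small sketch of the distinction being made here, assuming a reasonably recent NumPy: storing a NaN scalar in an integer element raises ValueError, while casting a whole array does not. The `checked_astype` helper is hypothetical (it is not a NumPy function); it simply refuses non-finite input before casting.

```python
import numpy as np

a = np.ones(4, np.int16)
scalar_assignment_raised = False
try:
    a[2] = np.nan  # a NaN scalar cannot be stored in an integer element
except ValueError:
    scalar_assignment_raised = True


def checked_astype(arr, int_type):
    """Hypothetical helper: cast to int_type, refusing NaN and +/-inf."""
    arr = np.asarray(arr, dtype=np.float64)
    if not np.isfinite(arr).all():
        raise ValueError("cannot convert non-finite values to integer")
    return arr.astype(int_type)  # truncates towards zero, like plain astype


print(scalar_assignment_raised)
print(checked_astype([1.6, -2.7], np.int16))
```

This only guards against NaN and infinities; finite values outside the integer range would still need clipping.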
Josef

On 11.10.2011, at 9:18PM, josef.pktd@gmail.com wrote:
what numpy is this?
This was 1.6.1. I did suppress a RuntimeWarning that was raised on the first call, though:
In [33]: np.around([1.67,np.nan,np.inf,-np.inf], decimals=1, out=d)
/sw/lib/python2.7/site-packages/numpy/core/fromnumeric.py:37: RuntimeWarning: invalid value encountered in multiply
  result = getattr(asarray(obj),method)(*args, **kwds)
On master, an integer out raises a TypeError for any float input - not sure I'd consider that an improvement…
>>> np.__version__
'2.0.0.dev-8f689df'
>>> np.around([1.6,-23.42, -13.98, 0.14], out=c)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/derek/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 2277, in around
    return _wrapit(a, 'round', decimals, out)
  File "/Users/derek/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 37, in _wrapit
    result = getattr(asarray(obj),method)(*args, **kwds)
TypeError: ufunc 'rint' output (typecode 'd') could not be coerced to provided output parameter (typecode 'h') according to the casting rule "same_kind"
I thought the NaN might have been dealt with first, before casting to int, but that doesn't seem to be the case (on master, again):
>>> np.around([1.6,np.nan,np.inf,-np.inf])
array([ 2., nan, inf, -inf])
>>> np.around([1.6,np.nan,np.inf,-np.inf]).astype(np.int16)
array([2, 0, 0, 0], dtype=int16)
>>> np.around([1.6,np.nan,np.inf,-np.inf]).astype(np.int32)
array([ 2, -2147483648, -2147483648, -2147483648], dtype=int32)
Cheers, Derek

Hi,
On Tue, Oct 11, 2011 at 5:30 PM, Derek Homeier derek@astro.physik.uni-goettingen.de wrote:
I thought the NaN might have been dealt with first, before casting to int, but that doesn't seem to be the case (on master, again):
>>> np.around([1.6,np.nan,np.inf,-np.inf])
array([ 2., nan, inf, -inf])
>>> np.around([1.6,np.nan,np.inf,-np.inf]).astype(np.int16)
array([2, 0, 0, 0], dtype=int16)
>>> np.around([1.6,np.nan,np.inf,-np.inf]).astype(np.int32)
array([ 2, -2147483648, -2147483648, -2147483648], dtype=int32)
Just to whet the appetite:
In [85]: for t in np.sctypes['int'] + np.sctypes['uint']:
   ....:     print np.array([np.nan], float).astype(t)
   ....:
[0]
[0]
[-2147483648]
[-2147483648]
[-9223372036854775808]
[0]
[0]
[2147483648]
[2147483648]
[9223372036854775808]
In [89]: for t in np.sctypes['int'] + np.sctypes['uint']:
   ....:     print np.around(np.array([np.nan], float), out=np.zeros(1, t))
   ....:
/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/core/fromnumeric.py:2278: RuntimeWarning: invalid value encountered in rint
  return round(decimals, out)
[0]
[0]
[-2147483648]
[-2147483648]
[-9223372036854775808]
[0]
[0]
[2147483648]
[2147483648]
[9223372036854775808]
In [86]: np.__version__
Out[86]: '1.6.1'
Maybe it would be good to have a np.nice_round function?
See you,
Matthew

Hi,
On Tue, Oct 11, 2011 at 3:06 PM, Derek Homeier derek@astro.physik.uni-goettingen.de wrote:
In [42]: c = np.zeros(4, np.int16)
In [43]: d = np.zeros(4, np.int32)
In [44]: np.around([1.6,np.nan,np.inf,-np.inf], out=c)
Out[44]: array([2, 0, 0, 0], dtype=int16)
In [45]: np.around([1.6,np.nan,np.inf,-np.inf], out=d)
Out[45]: array([ 2, -2147483648, -2147483648, -2147483648], dtype=int32)
Perhaps a starting point to harmonise this behaviour and get it closer to your expectations (it still would not be really nice having to define the output array first, I guess)...
Thanks - it hadn't occurred to me to try around with an output array - an interesting idea.
But - isn't this different but just as bad?
Best,
Matthew

On Tue, Oct 11, 2011 at 2:06 PM, Derek Homeier derek@astro.physik.uni-goettingen.de wrote:
np.[a]round comes closer to what you wish (is there consensus that NaN should map to 0?), but not quite there, and it's not really consistent either!
In a way, there is already consensus in the code. np.nan_to_num() by default converts NaNs to zero, and the infinities to the largest finite float and its negative.
>>> np.set_printoptions(precision=8)
>>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])
>>> np.nan_to_num(x)
array([ 1.79769313e+308, -1.79769313e+308, 0.00000000e+000, -1.28000000e+002, 1.28000000e+002])
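As a note for later readers: newer NumPy releases (1.17 and later) let np.nan_to_num take `nan`, `posinf` and `neginf` keyword arguments, so the replacement values can be the integer limits rather than the float limits. A minimal sketch, assuming float64 input and an integer type whose limits are exactly representable in float64 (int32 or smaller); `to_int_via_nan_to_num` is a hypothetical name, not a NumPy function.

```python
import numpy as np


def to_int_via_nan_to_num(arr, int_type=np.int16):
    """Hypothetical helper: NaN -> 0, +/-inf -> integer limits, then round."""
    info = np.iinfo(int_type)
    # nan=, posinf= and neginf= are keyword arguments of np.nan_to_num.
    cleaned = np.nan_to_num(np.asarray(arr, dtype=np.float64),
                            nan=0.0, posinf=info.max, neginf=info.min)
    # Finite values outside the integer range are NOT handled here;
    # they would still need clipping before the cast.
    return np.rint(cleaned).astype(int_type)


print(to_int_via_nan_to_num([1.6, np.nan, np.inf, -np.inf]))
```

For the example input this gives 2, 0, 32767 and -32768 as int16.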
Ben Root

Hi,
On Tue, Oct 11, 2011 at 7:32 PM, Benjamin Root ben.root@ou.edu wrote:
In a way, there is already consensus in the code. np.nan_to_num() by default converts nans to zero, and the infinities go to very large and very small.
Right - but - we'd still need to round, and take care of the nasty issue of thresholding:
>>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])
>>> x
array([ inf, -inf, nan, -128., 128.])
>>> nnx = np.nan_to_num(x)
>>> nnx
array([ 1.79769313e+308, -1.79769313e+308, 0.00000000e+000, -1.28000000e+002, 1.28000000e+002])
>>> np.rint(nnx).astype(np.int8)
array([ 0, 0, 0, -128, -128], dtype=int8)
So, I think nice_round would look something like:
def nice_round(arr, out_type):
    in_type = arr.dtype.type
    # Clip limits: the integer max and min, as floats of the input type
    mx = floor_exact(np.iinfo(out_type).max, in_type)
    mn = floor_exact(np.iinfo(out_type).min, in_type)
    nans = np.isnan(arr)
    out = np.rint(np.clip(arr, mn, mx)).astype(out_type)
    out[nans] = 0
    return out
with floor_exact being something like:
https://github.com/matthew-brett/nibabel/blob/range-dtype-conversions/nibabe...
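For readers without access to that branch, the idea behind such a helper can be sketched as follows. This is a hypothetical stand-in (named `floor_exact_sketch` here), not the nibabel code: it returns the largest float of a given type that does not exceed a given integer, which matters because the integer maximum is not always exactly representable as a float. It assumes float32 or float64 and an integer inside the finite range of that float type.

```python
import numpy as np


def floor_exact_sketch(val, flt_type=np.float64):
    """Largest flt_type value that is <= the Python integer val.

    Hypothetical stand-in for a floor_exact helper; assumes val is within
    the finite range of flt_type.
    """
    f = flt_type(val)
    if int(f) > val:
        # Conversion rounded up; step down to the next representable float.
        f = np.nextafter(f, flt_type(-np.inf))
    return f


int64_max = np.iinfo(np.int64).max      # 2**63 - 1
naive = np.float64(int64_max)           # rounds up to 2**63, out of range
safe = floor_exact_sketch(int64_max)    # 2**63 - 1024, still in range

print(int(naive) > int64_max)
print(int(safe))
```

Clipping to `naive` would leave values that overflow on the cast to int64, whereas clipping to `safe` keeps everything inside the integer range.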
See you,
Matthew

Hi,
On Sat, Oct 15, 2011 at 12:20 PM, Matthew Brett matthew.brett@gmail.com wrote:
In case anyone is interested or for the sake of anyone later googling this thread -
I made a working version of nice_round:
https://github.com/matthew-brett/nibabel/blob/floating-stash/nibabel/casting...
Docstring:
def nice_round(arr, int_type, nan2zero=True, infmax=False):
    """ Round floating point array `arr` to type `int_type`

    Parameters
    ----------
    arr : array-like
        Array of floating point type
    int_type : object
        Numpy integer type
    nan2zero : {True, False}
        Whether to convert NaN value to zero. Default is True. If False, and NaNs are present, raise CastingError
    infmax : {False, True}
        If True, set np.inf values in `arr` to be `int_type` integer maximum value, -np.inf as `int_type` integer minimum. If False, merely set infs to be numbers at or near the maximum / minimum number in `arr` that can be contained in `int_type`. Therefore False gives faster conversion at the expense of infs that are further from infinity.

    Returns
    -------
    iarr : ndarray of type `int_type`

    Examples
    --------
    >>> nice_round([np.nan, np.inf, -np.inf, 1.1, 6.6], np.int16)
    array([ 0, 32767, -32768, 1, 7], dtype=int16)
    """
It wasn't straightforward to find the right place to clip the array to stop overflow on casting, but I think it's working and tested now.
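For anyone googling this later who cannot reach the link, here is a minimal sketch of a function with the behaviour the docstring describes. It is an illustration under stated assumptions, not the nibabel implementation: the name `nice_round_sketch` is hypothetical, input is converted to float64, and a plain ValueError stands in for the CastingError mentioned in the docstring.

```python
import numpy as np


def nice_round_sketch(arr, int_type, nan2zero=True, infmax=False):
    """Round float64 data to int_type; an illustration, not the nibabel code."""
    arr = np.asarray(arr, dtype=np.float64)
    info = np.iinfo(int_type)
    nans = np.isnan(arr)
    if nans.any() and not nan2zero:
        # ValueError stands in for the CastingError named in the docstring.
        raise ValueError("NaNs present and nan2zero is False")
    # Clip limits must be floats that are still inside the integer range.
    mx = np.float64(info.max)
    if int(mx) > info.max:  # e.g. 2**63 - 1 rounds up to 2**63 in float64
        mx = np.nextafter(mx, -np.inf)
    mn = np.float64(info.min)  # integer minima are exactly representable
    rounded = np.rint(np.clip(arr, mn, mx))
    # Replace NaN before the cast so that no NaN is ever converted.
    out = np.where(nans, 0.0, rounded).astype(int_type)
    if infmax:
        # Optionally push infinities to the exact integer limits.
        out[arr == np.inf] = info.max
        out[arr == -np.inf] = info.min
    return out


print(nice_round_sketch([np.nan, np.inf, -np.inf, 1.1, 6.6], np.int16))
```

With `infmax=False`, +inf cast to int64 ends up at 2**63 - 1024 (the largest float64 inside the range) rather than at the true maximum, which is the "at or near the maximum" trade-off the docstring mentions.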
See y'all,
Matthew
participants (4)
- Benjamin Root
- Derek Homeier
- josef.pktd@gmail.com
- Matthew Brett