[Neuroimaging] Nibabel API change - always read as float

Blaise Frederick blaise.frederick at gmail.com
Mon Jul 6 17:46:54 CEST 2015


That seems reasonable.  It might also add clarity to define:

img.get_native_data()

which would just be an alias of

img.get_data(to_float=False)

That would have the advantage of making it immediately obvious what the code was doing (not that the other way doesn’t).
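
A minimal sketch of that alias, purely for illustration. `SketchImage` is a hypothetical stand-in and not the real nibabel image class; `to_float` and `get_native_data` are only the names proposed in this thread, and casting just the integer arrays to float64 is an assumption taken from the wording of the proposal.

```python
import numpy as np


class SketchImage:
    """Hypothetical stand-in for an image object (not nibabel's class)."""

    def __init__(self, raw):
        self._raw = np.asarray(raw)

    def get_data(self, *, to_float=True):
        # Proposed keyword-only argument: integer arrays become float64.
        if to_float and np.issubdtype(self._raw.dtype, np.integer):
            return self._raw.astype(np.float64)
        return self._raw

    def get_native_data(self):
        # The suggested alias: identical to get_data(to_float=False).
        return self.get_data(to_float=False)


img = SketchImage(np.array([1, 2, 3], dtype=np.int16))
native = img.get_native_data()   # keeps the stored dtype
floats = img.get_data()          # cast to floating point
```

The alias adds no behavior of its own; it only gives the "keep the stored dtype" case a name that can be read at the call site.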

Blaise

> On Jul 6, 2015, at 11:32 AM, Matthew Brett <matthew.brett at gmail.com> wrote:
> 
> Hi,
> 
> I wanted to ask y'all about an API change that I want to make to nibabel.
> 
> In summary, I want to default to returning floating point arrays from
> nibabel images.
> 
> Problem - different returned data types from img.get_data()
> -------------------------------------------------------------------------------
> 
> At the moment, if you do this:
> 
> img = nib.load('my_image.nii')
> data = img.get_data()
> 
> Then the data type (dtype) of the returned data array depends on the
> values in the header of `my_image.nii`.  Specifically, if the raw
> on-disk data type is `np.int16` (it often is) and the header
> scalefactor values are default (1 for slope, 0 for intercept) then you
> will get back an array of the on-disk data type, here `np.int16`.
> 
> This is very efficient on memory, but it's a real trap unless you are careful.
> 
> For example, let's say you had a pipeline where you did this:
> 
> sum = img.get_data().sum()
> 
> That would work fine most of the time, when the data on disk is
> floating point, or the scalefactors are not default (1, 0).   Then one
> day, you get an image with int16 data type on disk and 1, 0
> scalefactors, and your `sum` calculation silently overflows.    I ran
> into this when teaching - I had to cast some image arrays to floating
> point to get sensible answers.
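
A small NumPy-only sketch of the kind of silent wraparound described above, using a made-up two-element array rather than a real image. As far as I know, NumPy's `sum` widens the accumulator for small integer dtypes by default, so where a plain `.sum()` actually overflows depends on the platform and the image size; the sketch therefore forces an int16 accumulator, and also shows elementwise arithmetic, which stays int16 regardless.

```python
import numpy as np

# Made-up values standing in for int16 image data.
data = np.array([30000, 30000], dtype=np.int16)

# Elementwise arithmetic keeps the int16 dtype and wraps silently.
doubled = data + data

# Forcing an int16 accumulator makes the sum wrap in the same way.
narrow_sum = data.sum(dtype=np.int16)

# Casting to floating point first gives the expected answer.
float_sum = data.astype(np.float64).sum()
```

Here `doubled` and `narrow_sum` come out as -5536 (60000 wrapped into the int16 range), while `float_sum` is 60000.0.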
> 
> Solution
> -----------
> 
> I think that the default behavior of nibabel should be to do the thing
> least likely to trip you up by accident, so - I think in due course,
> nibabel should always return a floating point array from `get_data()`
> by default.
> 
> I propose to add a keyword-only argument to `get_data()` - `to_float`, as in:
> 
> data = img.get_data(to_float=False)  # The current default behavior
> data = img.get_data(to_float=True)  # Integer arrays automatically cast to float64
> 
> For this cycle (the nibabel 2.0 series), I propose to raise a warning
> if you don't pass in an explicit True or False, warning that the
> default behavior for nibabel 3.0 will change from `to_float=False` to
> `to_float=True`.
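
One way the transition described above could be sketched, under stated assumptions: `get_data_sketch` is a hypothetical free function and not nibabel code, the sentinel default and the choice of `FutureWarning` are my own guesses at an implementation, and only integer arrays are cast.

```python
import warnings

import numpy as np

_UNSET = object()  # sentinel: distinguishes "not passed" from True/False


def get_data_sketch(raw, *, to_float=_UNSET):
    """Hypothetical illustration of the proposed keyword and warning."""
    if to_float is _UNSET:
        warnings.warn(
            "The default will change from to_float=False to to_float=True; "
            "pass to_float explicitly to silence this warning.",
            FutureWarning,
            stacklevel=2,
        )
        to_float = False  # current default during the transition
    if to_float and np.issubdtype(raw.dtype, np.integer):
        return raw.astype(np.float64)
    return raw


raw = np.array([1, 2, 3], dtype=np.int16)

# No explicit argument: warns, and keeps the current behavior.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    default = get_data_sketch(raw)

# Explicit argument: no warning, integers cast to float64.
explicit = get_data_sketch(raw, to_float=True)
```

The sentinel is what lets the function warn only when the caller has not chosen; a plain `to_float=False` default could not tell the two cases apart.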
> 
> The other, more fancy ways of getting the image data would continue as
> they are, such as:
> 
> data = np.array(img.dataobj)
> data = img.dataobj[:]
> 
> These will both return ints or floats depending on the raw data dtype
> and the scalefactors.  This is on the basis that people using these
> will be more advanced, and therefore more likely to want memory
> efficiency at the expense of having to be careful about the returned
> data dtype.
> 
> Does this seem reasonable to y'all?    Thoughts, suggestions?
> 
> Cheers,
> 
> Matthew


