[Neuroimaging] Nibabel API change - always read as float

Mon Jul 6 17:55:08 CEST 2015

I agree, this sounds like good thinking. I might suggest that the parameter
be named "as_float" which is more congruent with "astype(float)", an
operation many will be familiar with.

On Mon, Jul 6, 2015 at 8:46 AM, Blaise Frederick <blaise.frederick at gmail.com> wrote:

> That seems reasonable.  It might also add clarity to define:
>
> img.get_native_data()
>
> which would just be an alias of
>
> img.get_data(to_float=False)
>
> That would have the advantage of making it immediately obvious what the
> code was doing (not that the other way doesn’t).
>
> Blaise
>
> > On Jul 6, 2015, at 11:32 AM, Matthew Brett <matthew.brett at gmail.com> wrote:
> >
> > Hi,
> >
> > I wanted to ask y'all about an API change that I want to make to nibabel.
> >
> > In summary, I want to default to returning floating point arrays from
> > nibabel images.
> >
> > Problem - different returned data types from img.get_data()
> >
> > -------------------------------------------------------------------------------
> >
> > At the moment, if you do this:
> >
> > img = nib.load('my_image.nii')
> > data = img.get_data()
> >
> > Then the data type (dtype) of the returned data array depends on the
> > values in the header of `my_image.nii`.   Specifically, if the raw
> > on-disk data type is 'np.int16' (it often is) and the header
> > scalefactor values are default (1 for slope, 0 for intercept) then you
> > will get back an array of the on disk data type - here - np.int16.
> >
> > This is very efficient on memory, but it's a real trap unless you're
> > careful.
> >
> > For example, let's say you had a pipeline where you did this:
> >
> > sum = img.get_data().sum()
> >
> > That would work fine most of the time, when the data on disk is
> > floating point, or the scalefactors are not default (1, 0).   Then one
> > day, you get an image with int16 data type on disk and 1, 0
> > scalefactors, and your `sum` calculation silently overflows.    I ran
> > into this when teaching - I had to cast some image arrays to floating
> > point to get sensible answers.
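
A minimal sketch of the trap described above, using a plain NumPy array as a stand-in for what `img.get_data()` might return for an int16 image with default scalefactors (the array values here are made up for illustration). One caveat: NumPy often accumulates `.sum()` of small integer arrays in a wider integer type, so whether the sum itself wraps depends on the platform and NumPy version. Elementwise arithmetic shows the same silent wraparound on any platform:

```python
import numpy as np

# Hypothetical stand-in for img.get_data() on an int16 image
# with default (1, 0) scalefactors.
raw = np.array([200, 300, 30000], dtype=np.int16)

# Elementwise arithmetic stays in int16 and wraps without any warning.
squared_int = raw * raw
print(squared_int.dtype)   # int16
print(squared_int[0])      # -25536, not 40000

# Casting to floating point first gives the expected values.
squared_float = raw.astype(np.float64) ** 2
print(squared_float[0])    # 40000.0
```

The cast costs four times the memory of the int16 array, which is the trade-off the rest of this proposal is about.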
> >
> > Solution
> > -----------
> >
> > I think that the default behavior of nibabel should be to do the thing
> > least likely to trip you up by accident, so - I think in due course,
> > nibabel should always return a floating point array from `get_data()`
> > by default.
> >
> > I propose to add a keyword-only argument to `get_data()` - `to_float`,
> > as in:
> >
> > data = img.get_data(to_float=False)  # The current default behavior
> > data = img.get_data(to_float=True)  # Integer arrays automatically cast to float64
> >
> > For this cycle (the nibabel 2.0 series), I propose to raise a warning
> > if you don't pass in an explicit True or False, warning that the
> > default behavior for nibabel 3.0 will change from `to_float=False` to
> > `to_float=True`.
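
The transition described above could be sketched roughly as follows. This is not nibabel code: `get_data_sketch`, the `_UNSET` sentinel, and the warning text are all hypothetical names chosen for illustration, and the function takes a plain array rather than an image. It only shows one way a keyword-only argument can warn when the caller leaves it unset:

```python
import warnings

import numpy as np

_UNSET = object()  # hypothetical sentinel: caller did not pass to_float


def get_data_sketch(raw, *, to_float=_UNSET):
    """Hypothetical stand-in for the proposed get_data() behavior."""
    if to_float is _UNSET:
        # Warn only when the caller relies on the default.
        warnings.warn(
            "the default will change from to_float=False to to_float=True; "
            "pass to_float explicitly to silence this warning",
            FutureWarning,
            stacklevel=2,
        )
        to_float = False  # current default during the transition
    arr = np.asarray(raw)
    if to_float and np.issubdtype(arr.dtype, np.integer):
        # Integer arrays are cast to float64; float arrays pass through.
        arr = arr.astype(np.float64)
    return arr


raw = np.array([1, 2, 3], dtype=np.int16)
explicit_native = get_data_sketch(raw, to_float=False)  # int16, no warning
explicit_float = get_data_sketch(raw, to_float=True)    # float64, no warning
```

Using a sentinel rather than `None` or `False` as the default lets the function tell "not passed" apart from an explicit `False`, which is what the warning depends on.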
> >
> > The other, more fancy ways of getting the image data would continue as
> > they are, such as:
> >
> > data = np.array(img.dataobj)
> > data = img.dataobj[:]
> >
> > These will both return ints or floats depending on the raw data dtype
> > and the scalefactors.  This is on the basis that people using these
> > will be more advanced and therefore more likely to want memory
> > efficiency at the expense of having to be careful about the returned
> > data dtype.
> >
> > Does this seem reasonable to y'all?    Thoughts, suggestions?
> >
> > Cheers,
> >
> > Matthew
> > _______________________________________________
> > Neuroimaging mailing list
> > Neuroimaging at python.org
> > https://mail.python.org/mailman/listinfo/neuroimaging
>