[Neuroimaging] Nibabel API change - always read as float

Mon Jul 6 23:19:07 CEST 2015

On Mon, Jul 6, 2015 at 8:15 PM, bthirion <bertrand.thirion at inria.fr> wrote:
>
>
> On 06/07/2015 18:37, Matthew Brett wrote:
>>
>> On Mon, Jul 6, 2015 at 5:32 PM, Bertrand Thirion
>> <bertrand.thirion at inria.fr> wrote:
>>>
>>> +1 we (and more importantly, our students) should rely as much as
>>> possible on the standard behavior of numpy arrays and make adequate
>>> decisions, rather than having to figure out the details of the API of
>>> neuroimaging libraries.
>>> So the default should be unchanged.
>>
>> Your reasoning implies the opposite.   Numpy tries very hard not to
>> return arrays of unknown or unpredictable data types, and that is the
>> situation we have here.   The returned datatype from a nibabel image
>> is essentially arbitrary, in that very few sources of nifti files
>> place any weight on whether there are non-default scalefactors or not.
>> At the moment we do depend on this, silently, and that is extremely
>> confusing and quite contrary to the standard numpy way.
>
> Sorry for being unclear, but Numpy would never force casting when loading
> data.
>
> When you get some array, you need to be aware of what it is in order to work
> with it. A mask or label image is not meant to be something on which you
> perform algebraic manipulations. Sure, you can get it wrong if you don't
> know what you're doing, but either this user has to learn it or he/she
> should consider using higher level interfaces to work with images.

`get_data()` was meant to be the higher level interface.

This argument would be a different one if it were always or even
usually true that an integer stored data type and slope of 1 and
intercept of 0 in a NIfTI signaled the intention that the data should
be treated as integers when loaded.  However, that isn't even close to
true.   Just for example, the large majority of functional images from
the scanner are int datatype with slope, intercept of 1, 0, and we
very rarely mean these to be treated as integers.
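
A minimal sketch of the problem, using plain numpy rather than nibabel.
The `load_scaled` function is a hypothetical stand-in for the behavior
under discussion (integer dtype returned when slope, intercept are
1, 0; float otherwise); it is not nibabel's actual implementation.

```python
import numpy as np

def load_scaled(raw, slope=1.0, inter=0.0):
    """Hypothetical loader: dtype of the result depends on the scalefactors."""
    if slope == 1 and inter == 0:
        # Default scalefactors: the stored integer dtype passes through.
        return raw
    # Non-default scalefactors: scaling produces floating point data.
    return raw.astype(np.float64) * slope + inter

# Values in the range a scanner might write as int16.
raw = np.array([30000, 20000], dtype=np.int16)

as_int = load_scaled(raw)               # int16, because slope, inter == 1, 0
as_float = load_scaled(raw, slope=2.0)  # float64, because slope != 1

# Arithmetic on the integer array wraps around without any warning.
doubled = as_int + as_int               # array([ -5536, -25536], dtype=int16)

# Converting to float first gives the intended result.
safe = as_int.astype(np.float64) * 2    # array([60000., 40000.])
```

The same file contents give either `as_int` or `as_float` depending
only on the scalefactors in the header, and the integer case fails
silently as soon as the values are combined.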

I don't think it's sensible to try and educate the users out of this
one, because I made this mistake, and I know numpy very well and wrote
the relevant code in nibabel.

I think the dtype argument is OK; it may be better than `asfloat`. It
starts becoming a little complicated having to deal with all possible
output types. Rounding floats to ints, for example, is not as
straightforward as it may seem (you have to clip the output so as not
to overflow the ints).
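
As a rough illustration of the clipping step, here is a sketch in plain
numpy. The helper name `float_to_int` is made up for this example and
is not part of nibabel; it assumes finite input (no NaN handling) and
a target type small enough that its limits are exact in float64, such
as int16.

```python
import numpy as np

def float_to_int(arr, dtype=np.int16):
    """Round floats and clip to the range of `dtype` before casting."""
    info = np.iinfo(dtype)
    # np.rint rounds to the nearest integer, with ties going to even.
    rounded = np.rint(arr)
    # Clip first: casting out-of-range floats to int is not well defined.
    clipped = np.clip(rounded, info.min, info.max)
    return clipped.astype(dtype)

values = np.array([-40000.0, -1.5, 2.5, 0.4, 40000.0])
result = float_to_int(values)
# result is array([-32768, -2, 2, 0, 32767], dtype=int16)
```

For int64 the upper limit is not exactly representable as a float64, so
this simple approach would need more care there.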

Cheers,

Matthew