[Neuroimaging] Nibabel API change - always read as float

Mon Jul 6 20:12:27 CEST 2015

I agree that the 'dtype' keyword is the better option. It gives more flexibility and better aligns with numpy.

I am on the fence about what the default should be, but I lean towards Matthew's argument that avoiding (or at least not promoting) subtle bugs is more important than efficiency for novice users.

-Brendan

________________________________
From: Neuroimaging [neuroimaging-bounces+moloney=ohsu.edu at python.org] on behalf of Ariel Rokem [arokem at gmail.com]
Sent: Monday, July 06, 2015 10:53 AM
To: Neuroimaging analysis in Python
Subject: Re: [Neuroimaging] Nibabel API change - always read as float

On Mon, Jul 6, 2015 at 9:56 AM, Ben Cipollini <bcipolli at ucsd.edu> wrote:
How about accepting a dtype parameter, with None meaning the type determined on load (i.e. the current behavior)? This makes it easy to document the 'arbitrary' behavior for the user (when explaining the parameter) and to convert to whatever datatype you want (if you know your data or have particular needs, like Alex). I also believe this is closer to numpy semantics; to_float would be a new semantic (even if a relatively simple one).

+1 for a `dtype` kwarg, rather than `as_float`, with the input being a numpy dtype to which things get cast. That will give Alex the flexibility to choose float32, rather than float64, if he doesn't need all that extra precision.
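
To make the proposal concrete, here is a minimal sketch of the semantics being discussed, written with plain numpy rather than nibabel. The function `get_scaled` and its `slope`/`inter` arguments are hypothetical stand-ins, not nibabel's API: `dtype=None` keeps the behavior described in this thread (the on-disk type unless the scale factors are non-default), and any other value casts to the requested numpy dtype.

```python
import numpy as np


def get_scaled(raw, slope=1.0, inter=0.0, dtype=None):
    """Hypothetical stand-in for a ``get_data(dtype=...)`` call (not nibabel's API)."""
    raw = np.asarray(raw)
    has_scaling = not (slope == 1.0 and inter == 0.0)
    if dtype is None:
        # Behavior described in the thread: the result type depends on
        # whether the file carries non-default scale factors.
        if not has_scaling:
            return raw
        return raw.astype(np.float64) * slope + inter
    if not has_scaling:
        return raw.astype(dtype)
    # Scale in float64, then cast to whatever the caller asked for.
    return (raw.astype(np.float64) * slope + inter).astype(dtype)


on_disk = np.array([0, 1, 2, 3], dtype=np.int16)

unscaled_default = get_scaled(on_disk)                 # keeps the on-disk int16
scaled_default = get_scaled(on_disk, slope=2.0)        # silently becomes float64
scaled_float32 = get_scaled(on_disk, slope=2.0, dtype=np.float32)
unscaled_float32 = get_scaled(on_disk, dtype=np.float32)
```

With an explicit `dtype`, both the scaled and the unscaled file come back as float32, so the caller no longer has to know what the header contains.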

As for what the default should be (None or float32 or float64)... I would not touch that conversation quite yet :)

I think the default here should be whatever your system does when you allocate an empty/zeros numpy array (that's float64 for me). Seems only slightly less arbitrary than the current behavior (I think). As I understand it, to mask with an array, you need to go to bool anyway, so this could be nice for the masking use-case:

    data[nib.load('mask.nii.gz').get_data(dtype=bool)]

If the implementation can avoid going through a memory-consuming dtype along the way, that would have a small advantage relative to a call to `astype` after loading into memory.
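
The memory point can be illustrated with plain numpy. This is only a sketch with made-up arrays: `mask_on_disk` stands in for whatever `mask.nii.gz` stores, and no nibabel call is involved.

```python
import numpy as np

# Made-up image data and a mask stored as 0/1 integers.
data = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
mask_on_disk = (data > 11).astype(np.uint8)

# Route 1: load as float64 first, then convert to bool for indexing.
as_float = mask_on_disk.astype(np.float64)
bool_via_float = as_float.astype(bool)

# Route 2: go straight to bool, never allocating the float64 copy.
as_bool = mask_on_disk.astype(bool)

# Boolean indexing returns the selected voxels as a 1-D array.
selected = data[as_bool]

# Size of the intermediate float64 array versus the bool array.
float_bytes = as_float.nbytes
bool_bytes = as_bool.nbytes
```

Both routes end with the same mask, but the float64 intermediate takes eight times the memory of the bool array, which is the overhead a direct `dtype=bool` read could avoid.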

Cheers,

Ariel

On Mon, Jul 6, 2015 at 9:37 AM, Matthew Brett <matthew.brett at gmail.com> wrote:
On Mon, Jul 6, 2015 at 5:32 PM, Bertrand Thirion
<bertrand.thirion at inria.fr> wrote:
> +1 we (and more importantly, our students) should rely as much as possible
> on the standard behavior of numpy arrays and make adequate decisions, rather
> than having to figure out the details of the API of neuroimaging libraries.
> So the default should be unchanged.

Your reasoning implies the opposite. Numpy tries very hard not to
return arrays of unknown or unpredictable data types, and that is the
situation we have here. The returned datatype from a nibabel image
is essentially arbitrary, in that very few sources of nifti files
place any weight on whether there are non-default scalefactors or not.
At the moment, we do depend on this, silently, and that is extremely
confusing, and quite contrary to the standard numpy way.
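
As an illustration of why a header-dependent return type can be confusing, the sketch below subtracts two "images" holding the same values, once stored without scale factors and once with a slope of 2. The helper `load_like_current` is a hypothetical stand-in for the behavior described above, not nibabel code, and the wrap-around shown is just one example of a bug that an unexpected integer type can cause.

```python
import numpy as np


def load_like_current(raw, slope=1.0, inter=0.0):
    """Hypothetical stand-in: on-disk dtype unless scale factors are non-default."""
    if slope == 1.0 and inter == 0.0:
        return raw
    return raw.astype(np.float64) * slope + inter


a = np.array([10, 200], dtype=np.uint8)
b = np.array([20, 100], dtype=np.uint8)

# Files written without scale factors: the arrays stay uint8, and the
# subtraction wraps around modulo 256 without any error.
diff_int = load_like_current(a) - load_like_current(b)

# The same values stored halved with slope=2.0: the arrays come back
# as float64 and the subtraction behaves as expected.
diff_float = (load_like_current(a // 2, slope=2.0)
              - load_like_current(b // 2, slope=2.0))

print(diff_int.dtype, diff_int.tolist())      # uint8 [246, 100]
print(diff_float.dtype, diff_float.tolist())  # float64 [-10.0, 100.0]
```

The calling code is identical in both cases; only the file header decides whether the first difference comes out as 246 or -10.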

Cheers,

Matthew
_______________________________________________
Neuroimaging mailing list
Neuroimaging at python.org
https://mail.python.org/mailman/listinfo/neuroimaging