[Neuroimaging] iteraxis API - we need feedback

Sat Sep 5 05:02:53 CEST 2015

Hi,

On Sat, Sep 5, 2015 at 3:22 AM, Satrajit Ghosh <satra at mit.edu> wrote:
> hi matthew,
>
>>
>> for vol in img.iteraxis(3):  # iterate over 4th axis
>>     # do something with vol
>>
>> where `iteraxis` would use `dataobj` slicing under the hood.
>>
>> The questions are:
>>
>> * should this be a method on the image (`img.iteraxis`), the dataobj
>> (`img.dataobj.iteraxis`) or should it be a standalone function that
>> knows about arrays and array proxies? (`nibabel.iteraxis`);
>
>
> img.iteraxis seems like a good place.
>
>>
>> * how should the iterator optimize speed or memory?   Should this be
>> configurable?  For example, if you are iterating over the first axis
>> of a Nifti, then it will probably be most efficient to read all the
>> data into memory and return the slices from the numpy array.   This
>> will be very expensive in memory.   If a file is compressed, it may be
>> most efficient to uncompress the file and use the uncompressed version
>> with `dataobj` file slicing - but this will involve a temporary file
>> that may be very large.   Options are:
>>
>>     * find some heuristic to choose joint optimization for memory and
>> speed;
>>     * always optimize for memory;
>>     * always optimize for speed, saving memory where possible;
>>     * have a tuning kwarg selecting between these options.
>
>
> i don't know if there is a common heuristic - it really depends on the data
> characteristics as well as the system configuration.
>
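
To make the "tuning kwarg" option concrete, here is a rough sketch. Everything in it is an assumption for illustration: the `optimize` keyword and its two values are made-up names, and `Grid` / `DiskProxy` are toy 2D stand-ins for an in-memory array and a file-backed proxy, not nibabel classes. "speed" does one full read and slices in memory; "memory" goes back to the proxy for every slice.

```python
class Grid:
    """Toy in-memory 2D array: only `.shape` and tuple indexing."""

    def __init__(self, rows):
        self.rows = rows
        self.shape = (len(rows), len(rows[0]))

    def __getitem__(self, index):
        i, j = index
        if isinstance(i, int):
            return self.rows[i][j]              # one row
        return [row[j] for row in self.rows[i]]  # one column


class DiskProxy(Grid):
    """Toy file-backed proxy that counts simulated disk reads."""

    def __init__(self, rows):
        super().__init__(rows)
        self.reads = 0

    def __getitem__(self, index):
        self.reads += 1
        if index is Ellipsis:
            # Simulates loading the whole array into memory; with a real
            # proxy, np.asarray(dataobj) is the usual way to do this.
            return Grid(self.rows)
        return super().__getitem__(index)


def iteraxis(dataobj, axis, optimize="memory"):
    """Yield slices along `axis`; `optimize` is a hypothetical tuning kwarg."""
    if optimize not in ("memory", "speed"):
        raise ValueError("optimize must be 'memory' or 'speed'")
    if optimize == "speed":
        dataobj = dataobj[...]  # one full read, then slice in memory
    ndim = len(dataobj.shape)
    for i in range(dataobj.shape[axis]):
        index = [slice(None)] * ndim
        index[axis] = i
        yield dataobj[tuple(index)]


proxy = DiskProxy([[1, 2, 3], [4, 5, 6]])
columns = list(iteraxis(proxy, 1, optimize="speed"))
```

In the toy example the "speed" path touches the proxy once however many slices are requested, at the cost of holding the whole array; the "memory" path reads once per slice.
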
>>
>> The upside of image.iteraxis would be to embed knowledge we've gained
>> on these objects and simplify the interface for users.
>
>
> could you please clarify what you mean by "these objects"?

Sorry - I wasn't being clear - I mean knowledge of the dataobj objects.
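
For concreteness, here is a minimal sketch of the standalone-function variant. It is not an existing nibabel API: the `iteraxis` function and the `RecordingProxy` stand-in are hypothetical, and the sketch assumes only that the object has a `.shape` and accepts a tuple of slices and integers, as `dataobj` slicing does.

```python
def iteraxis(dataobj, axis):
    """Yield slices of `dataobj` along `axis`, one index at a time.

    Hypothetical helper: it relies only on `.shape` and tuple indexing.
    """
    ndim = len(dataobj.shape)
    if not -ndim <= axis < ndim:
        raise ValueError(f"axis {axis} out of range for {ndim} dimensions")
    axis %= ndim  # allow negative axes, as numpy does
    for i in range(dataobj.shape[axis]):
        index = [slice(None)] * ndim
        index[axis] = i
        # Only this one slice is requested from the underlying object
        yield dataobj[tuple(index)]


class RecordingProxy:
    """Stand-in for an array proxy: returns the index it was asked for."""

    def __init__(self, shape):
        self.shape = shape

    def __getitem__(self, index):
        return index


# Iterating over the 4th axis asks for one volume per step
requests = list(iteraxis(RecordingProxy((2, 3, 4, 5)), 3))
```

With a real image this would be called as `iteraxis(img.dataobj, 3)`, so that each volume is read only when the loop asks for it.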

>> Any thoughts?   Use-cases?
>
>
> thoughts/questions:
>
> - would iteraxis be for volume only or support surface and streamline
> formats?

The obvious case is axes of arrays.  I guess, when we've worked those
out, we can see if an 'axis' makes sense for something like
streamlines.

> - recommend testing these with hcp data. they are closer to resolution and
> size of what most datasets will look like in 5 years.
> - stay away from labels for axes or dimensions - this would be dependent on
> phase encoding direction (for epi images) as well as placement of object in
> the scanner. i think nibabel should not have to figure that out. if during
> construction the user labels these axes, then nibabel could use that
> information.

We can work out the meaning of some axes - such as time - and the
Nifti format +/- the json extension can give us more information if
stored.   Guessing would certainly be a bad idea.

> - [forget i'm saying this, but this is a general solution to the
> optimization problem] one could just change the format and store nii as an
> hdf5 dataset and you get both memory and speed optimization!

Sure - one day I suppose no-one will be using Nifti format, and on
that day - er - wait -
I've forgotten what you were saying :)

Cheers,

Matthew