[scikit-learn] using a mask for brain images

Fri May 5 07:18:42 EDT 2017

On Fri, May 05, 2017 at 12:13:08PM +0200, Alle Meije Wink wrote:
> Thanks for that Gael - I do know nilearn but in this case I did a depth-first
> search on doing SVM on brain images and ended up here :)

:). Darn, we need to work on our search engine optimization. Nilearn
should be the easiest way of doing SVMs on nifti images.

> The size of 'coef' is (1,205739), the size of mask[mask].T is (205739,)

> Same number of elements, different storage layout(?).

Correct.

> Turned out that the numpy.ravel() function -pointed out by my colleague- can
> solve that!

> >>> wmap[mask]=np.ravel(coef)

I suggested using coef[0] rather than ravel as ravel is more dangerous
(it will flatten brutally the array). But it will work too.

Cheers,

Gaël

> bw
> Alle Meije

> On 5 May 2017 at 07:59, Gael Varoquaux <gael.varoquaux at normalesup.org> wrote:

>     Hi Alle,

>     I think that what has changed between 2014 and today is that the
>     coefficients coef are now a 2D array (number of hyperplanes x number of
>     features). In your case, the first direction is of length one, so you
>     could just do:

>     coef = clf.coef_[0]

>     and your script should work.

>     The code of the Abraham paper cannot be updated, because papers are not
>     living objects. However, we are maintaining a package that encodes all
>     these patterns in higher-level construct: http://nilearn.github.io/
>     It might be a good idea to use this package, as it is maintained and has
>     quality assurance.

>     Best,

>     Gaël

>     On Thu, May 04, 2017 at 05:52:27PM +0200, Alle Meije Wink wrote:
>     > I have a script to classify MRI perfusion maps from healthy subjects and
>     > patients. For the file IO and the classifier I have started with the
>     example
>     > code in Abraham et al 2014 [https://arxiv.org/pdf/1412.3919.pdf].

>     > I use the same classifier as in the paper to produce a back-projected map
>     of
>     > classification weights, which I then want to 'unmask' like in the paper:

>     >     coef=clf.coef_
>     >     coef=featureselection.inverse_transform(coef)          

>     > and

>     >     map_name='weights_check.nii.gz'
>     >     wmap=np.zeros(mask.shape, dtype=X.dtype)
>     >     wmap[mask]=coef
>     >     img=nb.Nifti1Image(wmap,np.eye(4))
>     >     img.to_filename(map_name)   

>     > But the line "wmap[mask]=coef" throws an error "ValueError: boolean index
>     array
>     > should have 1 dimension". I tried the example code from the paper and
>     that
>     > works.

>     > Is the 'coef' array of back-projected SVM weights in some way different
>     than
>     > the masked input image? Or am I doing something else wrong? The error
>     suggests
>     > that the mask array is the problem.

>     > The complete script is attached.

>     > Many thanks for your help!
>     > Alle Meije

>     > # -*- coding: utf-8 -*-
>     > """
>     > classify MR images from controls (CON) and patients (MCI)

>     > """

>     > import os
>     > import numpy as np
>     > import nibabel as nb
>     > import sklearn as sl

>     > from sklearn.feature_selection import f_classif
>     > from sklearn import svm

>     > def subj_lists( rootdir='/data/a.wink/MR', starts=['CON','MCI'] ):

>     >     # make and empty list for each patient group
>     >     slists=[]
>     >     for i in range(0,len(starts)):
>     >         slists.append([]);

>     >     # add subjects to the group based on
>     >     for root, dirs, files in os.walk(rootdir):
>     >         for fname in files:
>     >             for i in range(0,len(starts)):
>     >                 if fname.startswith(starts[i]):
>     >                     slists[i].append(os.path.join(root,fname))

>     >     print('finished building subject lists')
>     >     return slists

>     > def mk_mask ( subj_lists=[] ):

>     >     # check whether mask can be loaded
>     >     mname='mask_check.nii.gz'

>     >     if os.path.isfile(mname):

>     >         msk=nb.load(mname).get_data()

>     >     # or must be built (values >20% in the subject images and >20% grey
>     matter in the template)
>     >     else:

>     >         num_im=0;
>     >         for i in range(0,len(subj_lists)):
>     >             for fname in subj_lists[i]:
>     >                 imdata=nb.load(fname).get_data()
>     >                 if 'msk' not in locals():
>     >                     msk=np.absolute(imdata)
>     >                 else:
>     >                     msk=msk+np.absolute(imdata)
>     >                 num_im+=1

>     >         msk/=num_im
>     >         msk/=np.amax(msk)
>     >         msk[msk<.2]=0
>     >         msk[msk>0]=1

>     >         grey=nb.load('/usr/local/spm8/apriori/grey.nii').get_data()
>     >         grey[grey<.2]=0
>     >         grey[grey>0]=1

>     >         msk=msk*grey

>     >         img=nb.Nifti1Image(msk,np.eye(4))
>     >         img.to_filename(mname)

>     >     msk=msk.astype(bool)

>     >     print('finished building mask')
>     >     return msk, mname

>     > def load_images( subj_lists=[], mask=[] ):

>     >     # load all the images, build a matrix X or subjects (rows) *
>     inmask-voxels (columns)
>     >     num_im=0;
>     >     for i in range(0,len(subj_lists)):
>     >         for fname in subj_lists[i]:
>     >             imdata=nb.load(fname).get_data()
>     >             num_im+=1
>     >             print '\r' + str(num_im) +'\r',
>     >             if 'the_matrix' not in locals():
>     >                 the_matrix=imdata[mask].T
>     >             else:
>     >                 the_matrix=np.vstack((the_matrix,imdata[mask].T))

>     >     print('finished building matrix X')
>     >     return the_matrix

>     > def main():

>     >     # get the input data
>     >     subjlists = subj_lists()
>     >     y=np.concatenate( ([-1 for _ in range(len(subjlists[0]))],
>     >                        [ 1 for _ in range(len(subjlists[1]))]) )
>     >     mask, mas_name = mk_mask(subjlists)
>     >     X = load_images(subjlists,mask)
>     >     print "number of subjects, voxels: %d, %d" % X.shape

>     >     # select features
>     >     featureselection=sl.feature_selection.SelectKBest(f_classif, k=8000)
>     >     X_reduced=featureselection.fit_transform(X,y)

>     >     # classify
>     >     clf=svm.SVC(kernel='linear')
>     >     clf.fit(X_reduced,y)

>     >     # make discrimination map
>     >     coef=clf.coef_
>     >     coef=featureselection.inverse_transform(coef)

>     >     map_name='weights_check.nii.gz'
>     >     wmap=np.zeros(mask.shape, dtype=X.dtype)
>     >     wmap[mask]=coef
>     >     img=nb.Nifti1Image(wmap,np.eye(4))
>     >     img.to_filename(map_name)

>     > if __name__ == "__main__":
>     >     main()

>     > _______________________________________________
>     > scikit-learn mailing list
>     > scikit-learn at python.org
>     > https://mail.python.org/mailman/listinfo/scikit-learn
-- 
    Gael Varoquaux
    Researcher, INRIA Parietal
    NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
    Phone:  ++ 33-1-69-08-79-68
    http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux