[scikit-learn] What is the FeatureAgglomeration algorithm?

Gael Varoquaux gael.varoquaux at normalesup.org
Thu Jul 26 01:19:45 EDT 2018


FeatureAgglomeration uses the Ward, complete-linkage, or average-linkage
algorithm, depending on the choice of "linkage". These are well
documented in the literature and on Wikipedia.
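
For the example quoted below, the numbers can be reproduced by hand. The
following is a minimal sketch, assuming the default settings (n_clusters=2
and pooling_func=np.mean); the names X, results and manual are only
illustrative. It reads the per-feature cluster labels from labels_ and
averages the columns that share a label, which should match what
fit_transform returns:

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration

# Two samples (rows), four features (columns), as in the quoted question.
X = np.array([[-50, 6, 6, 7], [0, 1, 2, 3]], dtype=float)

results = {}
for linkage in ["ward", "average", "complete"]:
    fa = FeatureAgglomeration(n_clusters=2, linkage=linkage)
    reduced = fa.fit_transform(X)

    # labels_[j] is the cluster index assigned to feature (column) j.
    labels = fa.labels_

    # Assumed default pooling (np.mean): each output column is the mean
    # of the input columns that belong to one cluster.
    manual = np.column_stack(
        [X[:, labels == k].mean(axis=1) for k in np.unique(labels)]
    )
    assert np.allclose(reduced, manual)

    results[linkage] = (labels, reduced)
    print(linkage, labels, reduced, sep="\n")
```

Here the first feature, (-50, 0), lies far from the other three, which
are all close to each other, so every linkage merges features 1 to 3
first and leaves feature 0 alone. The pooled columns are therefore
(6+6+7)/3 = 6.33 and (1+2+3)/3 = 2 for the merged cluster, and -50 and 0
for the singleton, whichever linkage is chosen. The linkage only matters
when the candidate merges are less clear-cut than in this data.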

Gaël

On Thu, Jul 26, 2018 at 06:05:21AM +0100, Raphael C wrote:
> Hi,

> I am trying to work out what, in precise mathematical terms,
> [FeatureAgglomeration][1] does and would love some help. Here is some example
> code:


>     import numpy as np
>     from sklearn.cluster import FeatureAgglomeration
>     X = np.array([[-50, 6, 6, 7], [0, 1, 2, 3]])
>     for linkage in ['ward', 'average', 'complete']:
>         fa = FeatureAgglomeration(linkage=linkage)
>         print(fa.fit_transform(X))

> This outputs:

>     [[  6.33333333 -50.        ]
>      [  2.           0.        ]]
>     [[  6.33333333 -50.        ]
>      [  2.           0.        ]]
>     [[  6.33333333 -50.        ]
>      [  2.           0.        ]]

> Is it possible to say mathematically how these values have been computed?

> Also, what exactly does linkage do and why doesn't it seem to make any
> difference which option you choose?

> Raphael


>   [1]: http://scikit-learn.org/stable/modules/generated/sklearn.cluster.FeatureAgglomeration.html

> PS I also asked at
> https://stackoverflow.com/questions/51526616/what-does-featureagglomeration-compute-mathematically-and-when-does-linkage-make




-- 
    Gael Varoquaux
    Senior Researcher, INRIA Parietal
    NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
    Phone:  ++ 33-1-69-08-79-68
    http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux

