[scikit-learn] What is the FeatureAgglomeration algorithm?

Raphael C drraph at gmail.com
Thu Jul 26 01:05:21 EDT 2018


Hi,

I am trying to work out what, in precise mathematical terms,
[FeatureAgglomeration][1] does and would love some help. Here is some
example code:


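    import numpy as np
    from sklearn.cluster import FeatureAgglomeration

    # Two samples (rows) and four features (columns).
    X = np.array([[-50, 6, 6, 7], [0, 1, 2, 3]])
    # Try each linkage criterion; n_clusters keeps its default of 2.
    for linkage in ['ward', 'average', 'complete']:
        fa = FeatureAgglomeration(linkage=linkage)
        print(fa.fit_transform(X))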

This outputs:



    [[  6.33333333 -50.        ]
     [  2.           0.        ]]
    [[  6.33333333 -50.        ]
     [  2.           0.        ]]
    [[  6.33333333 -50.        ]
     [  2.           0.        ]]

Is it possible to say mathematically how these values have been computed?

Also, what exactly does the linkage parameter do, and why does it not
seem to make any difference here which option you choose?
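
For what it is worth, here is a minimal sketch of one way to check a
possible reading of the output. It assumes that FeatureAgglomeration groups
the columns (features) of the input by hierarchical clustering and that the
transform step replaces each group with its mean, since the default
pooling_func is np.mean. The helper pool_by_label is a hypothetical name used
only for this illustration and is not part of scikit-learn.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration

# Same data as above: two samples (rows), four features (columns).
X = np.array([[-50, 6, 6, 7], [0, 1, 2, 3]], dtype=float)


def pool_by_label(X, labels):
    """Hypothetical helper: average the columns of X that share a label.

    Output column k holds the row-wise mean of the features with label k.
    """
    return np.column_stack(
        [X[:, labels == k].mean(axis=1) for k in np.unique(labels)]
    )


results = {}
for linkage in ["ward", "average", "complete"]:
    fa = FeatureAgglomeration(n_clusters=2, linkage=linkage).fit(X)
    manual = pool_by_label(X, fa.labels_)   # assumed pooling rule
    library = fa.transform(X)               # what scikit-learn returns
    results[linkage] = (fa.labels_, manual, library)
    # Show the feature grouping and whether the two results agree.
    print(linkage, fa.labels_, np.allclose(manual, library))
```

If that reading is right, 6.33333333 is (6 + 6 + 7) / 3 and 2 is
(1 + 2 + 3) / 3, while -50 and 0 are the first feature left on its own.
The linkage option would then only change how the distance between groups of
features is measured when deciding which groups to merge next. In this data
the first column is so far from the other three that every linkage should
end up with the same two groups.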

Raphael


  [1]: http://scikit-learn.org/stable/modules/generated/sklearn.cluster.FeatureAgglomeration.html

PS I also asked at
https://stackoverflow.com/questions/51526616/what-does-featureagglomeration-compute-mathematically-and-when-does-linkage-make