[scikit-learn] hierarchical clustering
Roman Yurchak
rth.yurchak at gmail.com
Fri Nov 4 05:28:13 EDT 2016
Hi Jaime,
Alternatively, in scikit-learn, I think you could use
from sklearn.cluster import AgglomerativeClustering
hac = AgglomerativeClustering(n_clusters=n_clusters, linkage="ward")
hac.fit(data)
clusters = hac.labels_
There is an example of how to plot a dendrogram from this in
https://github.com/scikit-learn/scikit-learn/pull/3464
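For reference, here is a minimal sketch of the general idea, not the code from the pull request. It assumes a scikit-learn release newer than this thread, in which AgglomerativeClustering accepts distance_threshold and exposes distances_. The helper name linkage_matrix_from and the random data are hypothetical, chosen only for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram
from sklearn.cluster import AgglomerativeClustering


def linkage_matrix_from(model):
    """Build a SciPy-style linkage matrix from a fitted model (hypothetical helper)."""
    n_merges = model.children_.shape[0]
    n_samples = n_merges + 1
    counts = np.zeros(n_merges)
    for i, (left, right) in enumerate(model.children_):
        # Indices below n_samples are original points; larger ones refer to earlier merges.
        for child in (left, right):
            counts[i] += 1 if child < n_samples else counts[child - n_samples]
    return np.column_stack([model.children_, model.distances_, counts]).astype(float)


# Random stand-in for `data` (assumption): 20 points in 2-D.
rng = np.random.RandomState(0)
data = rng.normal(size=(20, 2))

# distance_threshold=0 with n_clusters=None builds the full tree and fills distances_.
model = AgglomerativeClustering(n_clusters=None, distance_threshold=0, linkage="ward")
model.fit(data)

Z = linkage_matrix_from(model)
# no_plot=True returns the dendrogram layout without needing matplotlib;
# drop it (and call plt.show()) to draw the figure.
tree = dendrogram(Z, no_plot=True)
```

The resulting Z can be passed to scipy's dendrogram with the same options used in the original question (truncate_mode, p, and so on).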
AgglomerativeClustering internally calls scikit-learn's version of
cut_tree. I would be curious to know whether this is equivalent to
scipy's fcluster.
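One way to check this empirically is sketched below. It is only a spot check on synthetic, well-separated blobs standing in for `data` (an assumption), and it uses fcluster's 'maxclust' criterion rather than a distance threshold. Since the two libraries number their clusters differently, the partitions are compared with the adjusted Rand index instead of comparing labels directly.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score

# Synthetic stand-in for `data` (assumption): three well-separated 2-D blobs.
rng = np.random.RandomState(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
data = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

n_clusters = 3

# scikit-learn: Ward agglomerative clustering cut at n_clusters.
hac = AgglomerativeClustering(n_clusters=n_clusters, linkage="ward")
sk_labels = hac.fit_predict(data)

# SciPy: Ward linkage, then cut the tree into at most n_clusters flat clusters.
Z = linkage(data, "ward")
sp_labels = fcluster(Z, n_clusters, criterion="maxclust")

# An adjusted Rand index of 1.0 means the two partitions are identical
# up to a renaming of the cluster labels.
ari = adjusted_rand_score(sk_labels, sp_labels)
print("adjusted Rand index:", ari)
```

Agreement on one dataset does not prove equivalence in general, in particular when merge distances are tied or a connectivity constraint is passed to scikit-learn.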
Roman
On 03/11/16 23:12, Jaime Lopez Carvajal wrote:
> Hi Juan,
>
> The fcluster function was what I needed. I can now proceed from here to
> classify images.
> Thank you very much,
>
> Jaime
>
> On Thu, Nov 3, 2016 at 5:00 PM, Juan Nunez-Iglesias <jni.soma at gmail.com
> <mailto:jni.soma at gmail.com>> wrote:
>
> Hi Jaime,
>
> From /Elegant SciPy/:
>
> """
> The *fcluster* function takes a linkage matrix, as returned by
> linkage, and a threshold, and returns cluster identities. It's
> difficult to know a-priori what the threshold should be, but we can
> obtain the appropriate threshold for a fixed number of clusters by
> checking the distances in the linkage matrix.
>
> from scipy.cluster.hierarchy import fcluster
> n_clusters = 3
> threshold_distance = (Z[-n_clusters, 2] + Z[-n_clusters+1, 2]) / 2
> clusters = fcluster(Z, threshold_distance, 'distance')
>
> """
>
> As an aside, I imagine this question is better placed in the SciPy
> mailing list than scikit-learn (which has its own hierarchical
> clustering API).
>
> Juan.
>
> On Fri, Nov 4, 2016 at 2:16 AM, Jaime Lopez Carvajal
> <jalopcar at gmail.com <mailto:jalopcar at gmail.com>> wrote:
>
> Hi there,
>
> I am trying to do image classification using hierarchical
> clustering.
> So, I have my data, and apply these steps:
>
> import numpy as np
> import matplotlib.pyplot as plt
> from scipy.cluster.hierarchy import dendrogram, linkage
>
> data1 = np.array(data)
> Z = linkage(data1, 'ward')
> dendrogram(Z, truncate_mode='lastp', p=12,
>            show_leaf_counts=False, leaf_rotation=90.,
>            leaf_font_size=12., show_contracted=True)
> plt.show()
>
> So, I can see the dendrogram with 12 clusters as I want, but I
> don't know how to use this to classify the image.
> Also, I understand that the function cluster.hierarchy.cut_tree(Z,
> n_clusters) cuts the tree at that number of clusters, but
> again I don't know how to proceed from there. I would like to
> have something like: cluster = predict(Z, instance)
>
> Any advice or direction would be really appreciated,
>
> Thanks in advance, Jaime
>
>
> --
> /*Jaime Lopez Carvajal
> */
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org <mailto:scikit-learn at python.org>
> https://mail.python.org/mailman/listinfo/scikit-learn
> <https://mail.python.org/mailman/listinfo/scikit-learn>
>