[scikit-learn] How to get the factorization from NMF in scikit learn

Vlad Niculae zephyr14 at gmail.com
Wed Sep 7 08:32:16 EDT 2016


Hi Raphael,

nmf.components_ holds one factor, H. The other matrix in the
factorization, W, is the output of nmf.transform(A). In your example
you forgot to fit the estimator. If you're just interested in the
decomposition, the recommended way is to fit and get W in one line
with W = nmf.fit_transform(A).
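
For concreteness, here is a minimal sketch of getting both factors for
the matrix from your message. The float dtype and the max_iter value
are my own choices, not requirements, and the exact numbers you get
will depend on your scikit-learn version.

```python
import numpy as np
from sklearn.decomposition import NMF

# Same matrix as in the original question, stored as floats.
A = np.array([[0, 1, 0], [1, 0, 1], [1, 1, 0]], dtype=float)

# max_iter is raised above the default only to give the solver room to
# converge on this toy problem (an assumption, not a recommendation).
nmf = NMF(n_components=3, init='random', random_state=0, max_iter=1000)

W = nmf.fit_transform(A)  # samples factor, shape (n_samples, n_components)
H = nmf.components_       # features factor, shape (n_components, n_features)

print(W.shape, H.shape)
print(np.round(W @ H, 2))  # should be close to A
```

Calling fit_transform both fits the estimator and returns W, so there
is no need for a separate call to fit.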

While the mathematical description doesn't make it immediately
obvious, the scikit-learn API makes a distinction between the two
factors W and H based on whether they lie along the samples or the
features direction. W is a representation of the samples in the
learned latent space, shape (n_samples, n_components). Meanwhile, H is
a representation of the features, shape (n_components, n_features), so
it's useful to store it *in the transformer* in case more samples with
the same features arrive later (e.g., at test time) and you want to
transform them.
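
To illustrate why H lives in the transformer, here is a small sketch
that fits on one matrix and then transforms unseen rows. The A_new
data and the choice of n_components=2 are made up for the example.

```python
import numpy as np
from sklearn.decomposition import NMF

A_train = np.array([[0, 1, 0], [1, 0, 1], [1, 1, 0]], dtype=float)
# Hypothetical new samples with the same 3 features as A_train.
A_new = np.array([[1, 1, 1], [0, 2, 0]], dtype=float)

nmf = NMF(n_components=2, init='random', random_state=0, max_iter=1000)
W_train = nmf.fit_transform(A_train)  # learns H and returns W for A_train
H_fitted = nmf.components_.copy()     # copy of H as learned during fit

# transform() keeps H fixed and only solves for the new rows of W.
W_new = nmf.transform(A_new)

print(W_new.shape)                               # (n_new_samples, n_components)
print(np.allclose(H_fitted, nmf.components_))    # H is left unchanged
```

The new samples are expressed in the latent space learned from
A_train, without refitting the estimator.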

HTH,
Vlad

On Wed, Sep 7, 2016 at 8:17 AM, Raphael C <drraph at gmail.com> wrote:
> I am trying to use NMF from scikit learn. Given a matrix A this should
> give me a factorization into matrices W and H so that WH is
> approximately equal to A. As a sanity check I tried the following:
>
> from sklearn.decomposition import NMF
> import numpy as np
> A = np.array([[0,1,0],[1,0,1],[1,1,0]])
> nmf = NMF(n_components=3, init='random', random_state=0)
> print nmf.components_
>
> This gives me a single 3 by 3 matrix as output. What is this
> representing? I want the two matrices W and H from the factorization.
> How can I get these two matrices?
>
> I am sure I am just missing something simple.
>
> Raphael
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn

