<br><br>On Monday, August 29, 2016, Andreas Mueller <<a href="mailto:t3kcit@gmail.com">t3kcit@gmail.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<br>
<br>
<div>On 08/28/2016 01:16 PM, Raphael C
wrote:<br>
</div>
<blockquote type="cite"><br>
<br>
On Sunday, August 28, 2016, Andy <<a href="mailto:t3kcit@gmail.com" target="_blank">t3kcit@gmail.com</a>> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> <br>
<br>
<div>On 08/28/2016 12:29 PM, Raphael C wrote:<br>
</div>
<blockquote type="cite">To give a little context from the web,
see e.g. <a href="http://www.quuxlabs.com/blog/2010/09/matrix-factorization-a-simple-tutorial-and-implementation-in-python/" target="_blank">http://www.quuxlabs.com/blog/2010/09/matrix-factorization-a-simple-tutorial-and-implementation-in-python/</a>, where
it explains:
<div><br>
</div>
<div>"</div>
<div><font size="2"><span style="background-color:rgba(255,255,255,0)">A
question might have come to your mind by now: if we
find two matrices <img src="http://www.quuxlabs.com/wp-content/latex/ccf/ccf6cb7a07e53d6a5c3e8449ae73d371-ffffff-000000-0.png" alt="\mathbf{P}" title="\mathbf{P}" style="margin:0px;padding:0px;vertical-align:middle;max-width:640px"> and <img src="http://www.quuxlabs.com/wp-content/latex/5e1/5e1ad0579fc06ddcbda6abaa092b7382-ffffff-000000-0.png" alt="\mathbf{Q}" title="\mathbf{Q}" style="margin:0px;padding:0px;vertical-align:middle;max-width:640px"> such
that <img src="http://www.quuxlabs.com/wp-content/latex/4e3/4e37888e71add225aafff9e943e66b88-ffffff-000000-0.png" alt="\mathbf{P} \times \mathbf{Q}" title="\mathbf{P}
\times \mathbf{Q}" style="margin:0px;padding:0px;vertical-align:middle;max-width:640px"> approximates <img src="http://www.quuxlabs.com/wp-content/latex/e1f/e1fd601dbae82a538d518550acb1af19-ffffff-000000-0.png" alt="\mathbf{R}" title="\mathbf{R}" style="margin:0px;padding:0px;vertical-align:middle;max-width:640px">,
isn’t that our predictions of all the unseen ratings
will all be zeros? In fact, we are not really trying
to come up with <img src="http://www.quuxlabs.com/wp-content/latex/ccf/ccf6cb7a07e53d6a5c3e8449ae73d371-ffffff-000000-0.png" alt="\mathbf{P}" title="\mathbf{P}" style="margin:0px;padding:0px;vertical-align:middle;max-width:640px"> and <img src="http://www.quuxlabs.com/wp-content/latex/5e1/5e1ad0579fc06ddcbda6abaa092b7382-ffffff-000000-0.png" alt="\mathbf{Q}" title="\mathbf{Q}" style="margin:0px;padding:0px;vertical-align:middle;max-width:640px"> such
that we can reproduce <img src="http://www.quuxlabs.com/wp-content/latex/e1f/e1fd601dbae82a538d518550acb1af19-ffffff-000000-0.png" alt="\mathbf{R}" title="\mathbf{R}" style="margin:0px;padding:0px;vertical-align:middle;max-width:640px"> exactly.
Instead, we will only try to minimise the errors of
the observed user-item pairs. </span></font></div>
<div><font size="2"><span>"</span></font><br>
</div>
</blockquote>
Yes, the sklearn interface is meant for matrix factorization,
not matrix completion.<br>
There was a PR for matrix completion for missing value
imputation at some point.<br>
<br>
In general, scikit-learn doesn't really implement anything for
recommendation algorithms as that requires a different
interface.<br>
</div>
</blockquote>
<div><br>
</div>
<div>Thanks Andy. I just looked up that PR.</div>
<div><br>
</div>
<div>I was thinking that simply producing a different factorisation,
optimised only over the observed values, wouldn't need a new
interface. That in itself would be hugely useful.</div>
</blockquote>
Depends. Usually you don't want to complete all values, but only
compute a factorization. What do you return? Only the factors?</div><div bgcolor="#FFFFFF" text="#000000"><br>
The PR implements completing everything, and that you can do with
the transformer interface. I'm not sure what the status of the PR
is,
but doing that with NMF instead of SVD would certainly also be
interesting.</div></blockquote><div><br></div><div>I was thinking you would literally return W and H so that WH approx X. The user can then decide what to do with the factorisation just like when doing SVD.</div><div><br></div><div>Raphael </div><div><br></div><div> </div>
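<div>As a rough sketch of what I mean (this is not scikit-learn API; <code>masked_nmf</code>, <code>observed_error</code> and the <code>mask</code> argument are names made up for illustration), a weighted multiplicative-update NMF in NumPy could minimise the squared error over the observed entries only and return just W and H. It assumes X is non-negative and that <code>mask</code> is a boolean array of the same shape, True where a value was observed:</div>

```python
import numpy as np


def observed_error(X, mask, W, H):
    """Squared reconstruction error summed over the observed entries only."""
    M = mask.astype(float)
    return float(np.sum((M * (X - W @ H)) ** 2))


def masked_nmf(X, mask, n_components, n_iter=500, eps=1e-9, seed=0):
    """Return non-negative W, H with W @ H close to X on the observed entries.

    Uses weighted multiplicative updates: unobserved entries get weight 0,
    so they contribute nothing to the objective or to the updates.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    W = rng.random((n_rows, n_components)) + eps
    H = rng.random((n_components, n_cols)) + eps
    M = mask.astype(float)
    MX = M * X  # unobserved entries of X are zeroed out and never fitted
    for _ in range(n_iter):
        # eps in the denominator guards against division by zero
        W *= (MX @ H.T) / ((M * (W @ H)) @ H.T + eps)
        H *= (W.T @ MX) / (W.T @ (M * (W @ H)) + eps)
    return W, H
```

<div>Calling <code>W, H = masked_nmf(X, mask, n_components=2)</code> gives back only the factors. Whether to form <code>W @ H</code> to fill in the unobserved entries is then left to the user, just like with SVD.</div>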