[scikit-learn] Does NMF optimise over observed values

Mon Aug 29 13:01:57 EDT 2016

If X is sparse, explicit zeros and missing-value zeros are **both**
considered as zeros in the objective functions.
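
To make the difference concrete, here is a minimal NumPy sketch (not scikit-learn code) that compares the squared Frobenius error over every entry with the same error restricted to observed entries. The helper names `full_loss` and `observed_loss`, the toy matrix, the 0.5 scaling, and the convention that 0 means "not rated" are all assumptions made for illustration.

```python
import numpy as np

def full_loss(X, W, H):
    """Squared Frobenius error over every entry, zeros included."""
    return 0.5 * np.sum((X - W @ H) ** 2)

def observed_loss(X, W, H, mask):
    """Squared error restricted to the entries where mask is True."""
    return 0.5 * np.sum((mask * (X - W @ H)) ** 2)

# Toy ratings matrix; 0 stands for "not rated" (an assumption of this sketch).
X = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 1.0],
              [0.0, 1.0, 5.0]])
mask = X > 0

# Arbitrary non-negative factors, only used to evaluate the two losses.
rng = np.random.default_rng(0)
W = rng.random((3, 2))
H = rng.random((2, 3))

full = full_loss(X, W, H)
obs = observed_loss(X, W, H, mask)
# The gap is exactly the penalty paid for predicting non-zero values
# at the unobserved positions.
gap = 0.5 * np.sum(((~mask) * (W @ H)) ** 2)
print(full, obs, gap)
```

The first loss pushes the reconstruction towards zero wherever nothing was observed; the second one ignores those positions entirely.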

Changing the objective function wouldn't need a new interface, yet I am not
sure the code change would be completely trivial.
The question is: do we want this new objective function in scikit-learn,
since we have no other recommendation-like algorithm?
If we agree that it would be useful, feel free to send a PR.
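
For reference, one common way to optimise only over observed values is to weight the usual multiplicative updates with a 0/1 mask. The following is a rough sketch under that assumption, not what scikit-learn does and not a proposed patch; the function name `masked_nmf` and its parameters are hypothetical.

```python
import numpy as np

def masked_nmf(X, mask, n_components, n_iter=200, eps=1e-9, seed=0):
    """Sketch of NMF fitted on observed entries only (mask == True).

    Uses mask-weighted multiplicative updates for the squared error.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    W = rng.random((n_rows, n_components)) + eps
    H = rng.random((n_components, n_cols)) + eps
    MX = mask * X  # unobserved entries contribute nothing

    def loss():
        return 0.5 * np.sum((mask * (X - W @ H)) ** 2)

    losses = [loss()]  # loss before any update
    for _ in range(n_iter):
        # Each update only sees the reconstruction at observed positions.
        W *= (MX @ H.T) / ((mask * (W @ H)) @ H.T + eps)
        H *= (W.T @ MX) / (W.T @ (mask * (W @ H)) + eps)
        losses.append(loss())
    return W, H, losses

# Toy example; 0 stands for "not rated" (an assumption of this sketch).
X = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 1.0],
              [0.0, 1.0, 5.0]])
mask = X > 0
W, H, losses = masked_nmf(X, mask, n_components=2)
print(losses[0], losses[-1])
print(np.round(W @ H, 2))  # unobserved positions are now free predictions
```

The factors stay non-negative because every update multiplies by a non-negative ratio. A real implementation would also need to handle sparse input, regularisation and a stopping criterion, which is where the non-trivial part of the code change would be.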

Tom

2016-08-29 17:50 GMT+02:00 Andreas Mueller <t3kcit at gmail.com>:

>
>
> On 08/28/2016 01:16 PM, Raphael C wrote:
>
>
>
> On Sunday, August 28, 2016, Andy <t3kcit at gmail.com> wrote:
>
>>
>>
>> On 08/28/2016 12:29 PM, Raphael C wrote:
>>
>> To give a little context from the web, see e.g.
>> http://www.quuxlabs.com/blog/2010/09/matrix-factorization-a-simple-tutorial-and-implementation-in-python/
>> where it explains:
>>
>> "
>> A question might have come to your mind by now: if we find two matrices
>> P and Q such that P × Q approximates R, isn’t that our predictions
>> of all the unseen ratings will all be zeros? In fact, we are not really
>> trying to come up with P and Q such that we can reproduce R exactly.
>> Instead, we will only try to minimise the errors of the observed
>> user-item pairs.
>> "
>>
>> Yes, the sklearn interface is not meant for matrix completion but
>> matrix-factorization.
>> There was a PR for some matrix completion for missing value imputation at
>> some point.
>>
>> In general, scikit-learn doesn't really implement anything for
>> recommendation algorithms as that requires a different interface.
>>
>
> Thanks Andy. I just looked up that PR.
>
> I was thinking simply producing a different factorisation optimised only
> over the observed values wouldn't need a new interface. That in itself
> would be hugely useful.
>
> Depends. Usually you don't want to complete all values, but only compute a
> factorization. What do you return? Only the factors?
> The PR implements completing everything, and that you can do with the
> transformer interface. I'm not sure what the status of the PR is,
> but doing that with NMF instead of SVD would certainly also be interesting.