[scikit-learn] OMP ended prematurely due to linear dependence in the dictionary

Vlad Niculae zephyr14 at gmail.com
Fri Feb 17 20:01:32 EST 2017


Oh I'm inclined to say this isn't a bug then. Your residuals can
simply be low enough to trigger early stopping this way. Although I
agree the warning could be improved.
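
A rough way to see how small values could cause this, under an assumption
that is NOT confirmed anywhere in this thread: suppose the solver gives up
when the squared correlation between the residual and the best atom drops
below the machine epsilon of the dictionary's dtype. The helper
`stops_early` and the sample magnitudes below are hypothetical and only
illustrate the arithmetic.

```python
import sys

# Machine epsilon for IEEE 754 single and double precision.
EPS_FLOAT32 = 2.0 ** -23
EPS_FLOAT64 = sys.float_info.epsilon  # 2.0 ** -52


def stops_early(correlation, eps):
    """Hypothetical stopping test: squared correlation below eps.

    This is an assumed form of the check, used only for illustration.
    """
    return correlation ** 2 < eps


# Correlation magnitudes in the range of the data values reported below.
for corr in (1e-2, 1e-4, 1e-8):
    print(corr, stops_early(corr, EPS_FLOAT32), stops_early(corr, EPS_FLOAT64))
```

Under that assumption, values around 1e-4 would trip the float32 threshold
but not the float64 one, while values around 1e-8 would trip both.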

However, if plugging in 32-bit X and 64-bit y DOES lead to *different
results* than when both have the same dtype (all other things being
equal), then that would be a bug (even if the different results don't
consist of an unwanted early stopping). Is this the case?
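
Something along these lines could serve as a starting point for checking.
It is only a sketch: the random dictionary, the sparse signal, and the
helper name `omp_coefficients` are made up for illustration, and the
signal is deliberately well scaled so that only the dtype differs between
the two fits.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit


def omp_coefficients(X, y, n_nonzero_coefs):
    """Fit OMP without an intercept and return the coefficient vector."""
    model = OrthogonalMatchingPursuit(
        n_nonzero_coefs=n_nonzero_coefs, fit_intercept=False
    )
    model.fit(X, y)
    return model.coef_


# Illustrative data: a random overcomplete dictionary with unit-norm columns.
rng = np.random.default_rng(0)
X64 = rng.standard_normal((64, 256))
X64 /= np.linalg.norm(X64, axis=0)

# A signal built from four atoms, so a sparse solution exists.
true_coef = np.zeros(256)
true_coef[[3, 50, 120, 200]] = [1.0, -2.0, 1.5, 0.5]
y64 = X64 @ true_coef

# Same data, but the dictionary is float64 in one fit and float32 in the other.
coef_same = omp_coefficients(X64, y64, 4)
coef_mixed = omp_coefficients(X64.astype(np.float32), y64, 4)

max_abs_diff = float(np.max(np.abs(coef_same - coef_mixed)))
same_support = bool(np.array_equal(coef_same != 0, coef_mixed != 0))
print("max abs difference:", max_abs_diff)
print("same selected atoms:", same_support)
```

Differences on the order of float32 rounding would be expected; a
different set of selected atoms would be the interesting case to report.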

On Fri, Feb 17, 2017 at 7:53 PM, Benjamin Merkt
<benjamin.merkt at bcf.uni-freiburg.de> wrote:
> While trying to build a minimal example to reproduce the error, I found
> that it also occurred when both arrays were float64. However, I then
> realized that my data vector has fairly small values (~1e-4 to 1e-8). If I
> normalize this as well, it works for all combinations of 64 and 32 bit.
>
> -Ben
>
>
>
> On 17.02.2017 01:56, Vlad Niculae wrote:
>>
>> I would consider this a bug. I'm not 100% sure what the conventions
>> for dtypes are. I'd appreciate it if you could open an issue, and even
>> better if you have a small reproducing example. I'll look into it this
>> weekend.
>>
>> Vlad
>>
>> On Fri, Feb 17, 2017 at 7:25 AM, Benjamin Merkt
>> <benjamin.merkt at bcf.uni-freiburg.de> wrote:
>>>
>>> Is this still considered a bug and therefore worth an issue?
>>>
>>>
>>> On 14.02.2017 13:34, Benjamin Merkt wrote:
>>>>
>>>>
>>>> Yes, the data array y was already float64.
>>>>
>>>>
>>>> On 14.02.2017 12:28, Vlad Niculae wrote:
>>>>>
>>>>>
>>>>> One possible issue I can see causing this is if X and y have different
>>>>> dtypes... was this the case for you?
>>>>>
>>>>> On Tue, Feb 14, 2017 at 8:26 PM, Vlad Niculae <zephyr14 at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> Hi Ben,
>>>>>>
>>>>>> This actually sounds like a bug in this case! At a glance, the code
>>>>>> should use the correct BLAS calls for the data type you provide. Can
>>>>>> you reproduce this with a simple small example that gets different
>>>>>> results if the data is 32 vs 64 bit? Would you mind filing an issue?
>>>>>>
>>>>>> Thanks,
>>>>>> Vlad
>>>>>>
>>>>>>
>>>>>> On Tue, Feb 14, 2017 at 8:19 PM, Benjamin Merkt
>>>>>> <benjamin.merkt at bcf.uni-freiburg.de> wrote:
>>>>>>>
>>>>>>>
>>>>>>> OK, the issue is resolved. My dictionary was still in 32-bit float
>>>>>>> from saving. When I convert it to 64-bit float before calling fit,
>>>>>>> it works fine.
>>>>>>>
>>>>>>> Sorry to bother.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 14.02.2017 11:00, Benjamin Merkt wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I tried that with no effect. The fit still breaks after two
>>>>>>>> iterations.
>>>>>>>>
>>>>>>>> If I set precompute=True I get three coefficients instead of only
>>>>>>>> two. My dictionary is fairly large (currently 128x42000). Is it even
>>>>>>>> feasible to use OMP with such a big matrix (even with ~120GB RAM)?
>>>>>>>>
>>>>>>>> -Ben
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 13.02.2017 23:31, Vlad Niculae wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Are the columns of your matrix normalized? Try setting
>>>>>>>>> `normalize=True`.
>>>>>>>>>
>>>>>>>>> Yours,
>>>>>>>>> Vlad
>>>>>>>>>
>>>>>>>>> On Mon, Feb 13, 2017 at 6:55 PM, Benjamin Merkt
>>>>>>>>> <benjamin.merkt at bcf.uni-freiburg.de> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi everyone,
>>>>>>>>>>
>>>>>>>>>> I'm using OrthogonalMatchingPursuit to get a sparse coding of a
>>>>>>>>>> signal using a dictionary learned by a KSVD algorithm (pyksvd).
>>>>>>>>>> However, during the fit I get the following RuntimeWarning:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> /usr/local/lib/python2.7/dist-packages/sklearn/linear_model/omp.py:391:
>>>>>>>>>> RuntimeWarning: Orthogonal matching pursuit ended prematurely due to
>>>>>>>>>> linear dependence in the dictionary. The requested precision might not
>>>>>>>>>> have been met.
>>>>>>>>>>   copy_X=copy_X, return_path=return_path)
>>>>>>>>>>
>>>>>>>>>> In those cases the results are indeed not satisfactory. I don't get
>>>>>>>>>> the point of this warning, as it is common in sparse coding to have an
>>>>>>>>>> overcomplete dictionary and thus also linear dependency within it.
>>>>>>>>>> That should not be an issue for OMP. In fact, the warning is also
>>>>>>>>>> raised if the dictionary is a square matrix.
>>>>>>>>>>
>>>>>>>>>> Might this Warning also point to other issues in the application?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks, Ben
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> scikit-learn mailing list
>>>>>>>>>> scikit-learn at python.org
>>>>>>>>>> https://mail.python.org/mailman/listinfo/scikit-learn

