[scikit-learn] DBScan freezes my computer !!!

Joel Nothman joel.nothman at gmail.com
Thu May 17 18:02:56 EDT 2018


There are two issues here:

1. We store all radius neighborhoods of all points in memory at once. This
is a problem if each point has a large radius neighborhood. DBSCAN only
requires that you store the radius neighbors of the point you are currently
examining. We could provide a memory-efficient mode that would do so.

2. Given that we store all neighborhoods at once, a brute-force nearest
neighbors search computes all pairwise distances in one go, which takes
O(n^2) memory. Chunking the operation reduces that peak memory use.
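
To illustrate the first point, here is a minimal sketch of DBSCAN that only
ever holds the radius neighborhood of the point currently being examined. It
is not the scikit-learn implementation or the pending patch; the names
region_query and dbscan_low_memory are made up for this example, and the
neighbor search is a plain brute-force scan. As in scikit-learn, min_samples
is assumed to count the point itself.

```python
import math


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i], including i."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan_low_memory(points, eps, min_samples):
    """Label points with cluster ids (0, 1, ...) or -1 for noise.

    Hypothetical helper: neighborhoods are recomputed on demand, so only
    one neighborhood plus the frontier of pending points is in memory.
    """
    UNVISITED, NOISE = -2, -1
    labels = [UNVISITED] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] != UNVISITED:
            continue
        if len(region_query(points, i, eps)) < min_samples:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        frontier = [i]  # each point is pushed at most once
        while frontier:
            j = frontier.pop()
            neighbors = region_query(points, j, eps)
            if len(neighbors) < min_samples:
                continue  # border point: in the cluster, but does not expand it
            for k in neighbors:
                if labels[k] == UNVISITED:
                    frontier.append(k)
                if labels[k] in (UNVISITED, NOISE):
                    labels[k] = cluster
        cluster += 1
    return labels
```

The trade-off is extra computation: the seed point's neighborhood is queried
twice here, and nothing is cached between queries.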

Patches for both solutions already exist, but they have not been reviewed yet.
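
For the second point, the idea of chunking can be sketched roughly as follows.
This is only an illustration with NumPy, not the code from the pending patch:
radius_neighbors_chunked and chunk_size are hypothetical names, and Euclidean
distance is assumed. All neighborhoods are still stored, as we do today, but
distances are only computed for chunk_size rows at a time.

```python
import numpy as np


def radius_neighbors_chunked(X, eps, chunk_size=256):
    """Return, for each row of X, the indices of rows within eps of it.

    Hypothetical helper: distances are computed chunk_size rows at a time,
    so the temporary arrays scale with chunk_size * n rather than n * n.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    neighborhoods = []
    for start in range(0, n, chunk_size):
        chunk = X[start:start + chunk_size]
        # Differences between this chunk and every point: (chunk, n, d).
        diff = chunk[:, None, :] - X[None, :, :]
        # Squared Euclidean distances for this chunk only: (chunk, n).
        d2 = np.einsum("ijk,ijk->ij", diff, diff)
        for row in d2 <= eps * eps:
            neighborhoods.append(np.flatnonzero(row))
    return neighborhoods
```

A smaller chunk_size lowers peak memory at the cost of more loop iterations;
the total amount of distance computation stays the same.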


On 18 May 2018 at 00:37, Mauricio Reis <reismc at gmail.com> wrote:

> I'm not used to the terms used here. So I understood that the package had
> memory management, which was removed. But you could make the code available
> with memory management implementations. Is it?! :-)
> The problem is that I do not know what I would do with the code, because I
> only know how to work with the SciKitLearn package ready. :-(
>
> Att.,
> Mauricio Reis
>
> 2018-05-16 20:33 GMT-03:00 Joel Nothman <joel.nothman at gmail.com>:
>
>> Implemented in a previous version of #10280
>> <https://github.com/scikit-learn/scikit-learn/pull/10280>, but removed
>> for now to simplify reviews
>> <https://github.com/scikit-learn/scikit-learn/pull/10280#pullrequestreview-95622713>.
>> If others would like to review #10280, I'm happy to follow up with the
>> changes requested here, which have already been implemented by Aman Dalmia
>> and myself.
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>>
>