[scikit-learn] DBScan freezes my computer !!!

Roman Yurchak rth.yurchak at gmail.com
Sun May 13 04:34:42 EDT 2018


Could you please check memory usage while DBSCAN is running, to confirm 
that the freeze is caused by running out of memory and not by something 
else? Which parameters do you run DBSCAN with? Changing the algorithm 
and leaf_size parameters, and making sure n_jobs=1, could help.
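
A rough way to check this before running the full fit is sketched below. 
It relies on the fact that scikit-learn's DBSCAN computes all 
neighbourhoods up front, so memory grows with the number of samples 
times the average number of neighbours within eps. The array X, the eps 
value and the subsample size are hypothetical stand-ins, not values 
from your setup.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors

rng = np.random.RandomState(0)
X = rng.rand(2000, 5)  # hypothetical stand-in for data scaled to [0, 1]

eps = 0.5  # hypothetical value; substitute the eps you actually use

# Count neighbours within eps for a small subsample of query points.
sample = X[:200]
nn = NearestNeighbors(radius=eps, algorithm="ball_tree", leaf_size=30).fit(X)
neighborhoods = nn.radius_neighbors(sample, return_distance=False)
avg_neighbors = np.mean([len(n) for n in neighborhoods])

# Rough estimate of the storage needed for the neighbour index arrays:
# about n_samples * avg_neighbors integers.
est_bytes = X.shape[0] * avg_neighbors * np.dtype(np.intp).itemsize
print("average neighbours per point: %.0f" % avg_neighbors)
print("estimated neighbourhood storage: %.1f MB" % (est_bytes / 1e6))

# Explicit parameters, single process, so behaviour is easier to reason about.
db = DBSCAN(eps=eps, min_samples=5, algorithm="ball_tree",
            leaf_size=30, n_jobs=1)
labels = db.fit_predict(X)
```

If the average neighbourhood is a large fraction of the dataset, an eps 
that suited the original data is probably too large for data rescaled 
to [0, 1].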

Assuming eps is reasonable, I think it shouldn't be an issue to run 
DBSCAN on L2-normalized data: with the default euclidean metric, this 
should give results similar to clustering the unnormalized data with 
metric='cosine', since for unit vectors the squared euclidean distance 
is twice the cosine distance (so eps has to be rescaled accordingly).
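
As a minimal sketch of that correspondence, the snippet below compares 
the two setups on synthetic data. The blobs, the eps value and 
min_samples are hypothetical choices for illustration; the only 
relation it assumes is that for unit vectors ||a - b||^2 equals 
2 * (1 - cos(a, b)), so a cosine eps maps to a euclidean eps of 
sqrt(2 * eps).

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import adjusted_rand_score
from sklearn.metrics.pairwise import cosine_distances, euclidean_distances
from sklearn.preprocessing import normalize

rng = np.random.RandomState(0)
# Hypothetical data: three blobs placed along different axes so that
# they point in clearly different directions from the origin.
centers = np.array([[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0]])
X = np.vstack([c + 0.3 * rng.randn(100, 3) for c in centers])

X_unit = normalize(X, norm="l2")  # every row now has unit length

# Check the identity: squared euclidean distance between unit vectors
# equals twice the cosine distance between the original vectors.
d_euclid = euclidean_distances(X_unit)
d_cosine = cosine_distances(X)
identity_holds = np.allclose(d_euclid ** 2, 2 * d_cosine, atol=1e-8)

eps_cosine = 0.01  # hypothetical value
eps_euclid = np.sqrt(2 * eps_cosine)  # matching radius on the unit sphere

labels_cosine = DBSCAN(eps=eps_cosine, min_samples=5,
                       metric="cosine").fit_predict(X)
labels_euclid = DBSCAN(eps=eps_euclid, min_samples=5).fit_predict(X_unit)

# Adjusted Rand index of 1.0 means the two labelings agree exactly.
agreement = adjusted_rand_score(labels_cosine, labels_euclid)
print("identity holds:", identity_holds)
print("agreement (adjusted Rand index):", agreement)
```

Without rescaling eps the two runs use different effective radii, which 
is one reason the results can differ.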

On 13/05/18 00:20, Andrew Nystrom wrote:
> If you’re l2 norming your data, you’re making it live on the surface of 
> a hypersphere. That surface will have a high density of points and may 
> not have areas of low density, in which case the entire surface could be 
> recognized as a single cluster if epsilon is high enough and min 
> neighbors is low enough. I’d suggest not using l2 norm with DBSCAN.
> On Sat, May 12, 2018 at 7:27 AM Mauricio Reis <reismc at gmail.com> wrote:
> 
>     The DBScan "fit" method (in scikit-learn v0.19.1) is freezing my
>     computer without any warning message!
> 
>     I am using WinPython 3.6.5 64 bit.
> 
>     The method works normally with the original data, but freezes when I
>     use the normalized data (between 0 and 1).
> 
>     What should I do?
> 
>     Att.,
>     Mauricio Reis
>     _______________________________________________
>     scikit-learn mailing list
>     scikit-learn at python.org <mailto:scikit-learn at python.org>
>     https://mail.python.org/mailman/listinfo/scikit-learn


