<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Should we have "low memory"/batched versions of the kneighbors_graph
and radius_neighbors_graph functions? I assume those instantiate the
dense matrix right now.<br>
</p>
<br>
<div class="moz-cite-prefix">On 05/13/2018 10:59 PM, Joel Nothman
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAAkaFLWep1PZaY6e-eDQrtcZgUjLcdqU1+6t6EX+WrJRVM1yPg@mail.gmail.com">
<div dir="ltr">This is quite a common issue with our
implementation of DBSCAN, and improvements to documentation
would be very, very welcome.
<div><br>
</div>
<div>The high memory cost comes from computing the radius
neighborhoods of all points at once. If you use a distance metric
that cannot be indexed with a KD-tree or Ball Tree, this
results in n^2 floats (one distance per pair of the n samples)
being stored in memory even before the radius neighbors are
computed.<br>
<div><br>
</div>
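<div>For a sense of scale, here is a back-of-the-envelope sketch of
the n^2 cost. It assumes 8-byte (float64) distances, and the helper
name is made up for this illustration:</div>
<pre>
```python
def dense_distance_matrix_bytes(n_samples, bytes_per_float=8):
    """Bytes needed to hold a dense n x n matrix of pairwise distances.

    Hypothetical helper for illustration; assumes float64 by default.
    """
    return n_samples * n_samples * bytes_per_float


for n in (10_000, 100_000, 1_000_000):
    gib = dense_distance_matrix_bytes(n) / 2**30
    print(f"n = {n}: about {gib:.1f} GiB")
```
</pre>
<div><br>
</div>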
<div>You have the following strategies available to you
currently:</div>
</div>
<div><br>
</div>
<div>1. Calculate the radius neighborhoods using
radius_neighbors_graph in chunks, so as to avoid all pairs
being calculated and stored at once. This produces a sparse
graph representation, which can be passed into dbscan with
metric='precomputed'. (I've just seen that Sebastian suggested
the same.)</div>
<div>2. Reduce the number of samples in your dataset and
represent (near-)duplicate points with sample_weight (i.e. two
identical points would be merged but would have a
sample_weight of 2).</div>
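<div><br>
</div>
<div>Strategy 1 could look roughly like the sketch below. It is only
an illustration under assumptions: the data is a synthetic Euclidean
toy set, eps and min_samples are arbitrary, and chunked_radius_graph
and chunk_size are names invented here, not part of scikit-learn. It
relies on NearestNeighbors.radius_neighbors_graph, scipy.sparse.vstack
and DBSCAN with metric='precomputed'.</div>
<pre>
```python
import numpy as np
from scipy import sparse
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors


def chunked_radius_graph(X, eps, chunk_size=1000):
    """Hypothetical helper: build the sparse eps-neighborhood graph of X
    one block of query rows at a time."""
    nn = NearestNeighbors(radius=eps).fit(X)
    blocks = []
    for start in range(0, X.shape[0], chunk_size):
        chunk = X[start:start + chunk_size]
        # Each call only handles chunk_size query points, and only the
        # neighbors within eps are kept in the returned sparse block.
        blocks.append(nn.radius_neighbors_graph(chunk, mode="distance"))
    return sparse.vstack(blocks, format="csr")


# Synthetic data (an assumption for the example): three 2-D blobs.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(loc=center, scale=0.3, size=(200, 2))
    for center in (0.0, 5.0, 10.0)
])

eps = 0.5
graph = chunked_radius_graph(X, eps, chunk_size=100)

# The sparse graph of distances is passed in place of the raw samples.
model = DBSCAN(eps=eps, min_samples=5, metric="precomputed").fit(graph)
labels = model.labels_
print("clusters found:", len(set(labels) - {-1}))
```
</pre>
<div>The radius used to build the graph has to be at least as large as
the eps given to DBSCAN, otherwise neighbors would be missing from the
precomputed graph. A smaller chunk_size lowers the peak memory per
call at the cost of more calls.</div>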
<div><br>
</div>
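<div>Strategy 2 might be sketched as follows for the simplest case of
exact duplicates. The data, eps and min_samples are made up for the
example, and merging near-duplicates would need an additional step
(for instance rounding or binning) that is not shown here.</div>
<pre>
```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic data (an assumption for the example):
# 50 base points, each repeated 20 times.
rng = np.random.RandomState(0)
base_points = rng.uniform(0, 10, size=(50, 2))
X = np.repeat(base_points, 20, axis=0)

# Collapse identical rows and remember how often each one occurred.
unique_points, inverse, counts = np.unique(
    X, axis=0, return_inverse=True, return_counts=True
)

# Each unique point counts as many times as it appeared in X.
model = DBSCAN(eps=0.5, min_samples=10).fit(unique_points, sample_weight=counts)

# Map the labels of the unique points back onto the original rows.
labels = model.labels_[inverse.ravel()]
print(X.shape[0], "rows clustered via", unique_points.shape[0], "unique points")
```
</pre>
<div><br>
</div>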
<div>There is also a proposal to offer an alternative
memory-efficient mode at <a
href="https://github.com/scikit-learn/scikit-learn/pull/6813"
moz-do-not-send="true">https://github.com/scikit-learn/scikit-learn/pull/6813</a>.
Feedback is welcome.</div>
<div><br>
</div>
<div>Cheers,</div>
<div><br>
</div>
<div>Joel</div>
<div><br>
</div>
<div><br>
</div>
</div>
<br>
<pre wrap="">_______________________________________________
scikit-learn mailing list
<a class="moz-txt-link-abbreviated" href="mailto:scikit-learn@python.org">scikit-learn@python.org</a>
<a class="moz-txt-link-freetext" href="https://mail.python.org/mailman/listinfo/scikit-learn">https://mail.python.org/mailman/listinfo/scikit-learn</a>
</pre>
</blockquote>
<br>
</body>
</html>