I concur with Stefan. Keeping datasets out of the package seems like the best way to go. There should be a separate go-to place for datasets (other than the minimal ones needed for test cases).
<br>
<br>I would recommend splitting all datasets off into their own home; otherwise we add to SciPy's already significant size.
<br>
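<br>To make the idea concrete, here is a rough sketch of what fetching a dataset on demand could look like, using only the Python standard library. The function name `fetch_dataset`, its parameters, and the cache layout are all hypothetical; this is not an existing SciPy API, nor the spec Stéfan links to, just one way a download-and-cache helper with checksum verification might be written.
<br>

```python
import hashlib
import urllib.request
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large files are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_dataset(url, filename, expected_sha256, cache_dir):
    """Hypothetical helper: download `url` to `cache_dir/filename`.

    A cached copy is reused only if its checksum matches
    `expected_sha256`; otherwise the file is downloaded again.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / filename

    # Reuse the cached file only if its checksum still matches.
    if target.exists() and sha256_of(target) == expected_sha256:
        return target

    # Download to a temporary name so a partial file never looks valid.
    tmp = target.with_name(target.name + ".part")
    with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
        while True:
            chunk = response.read(1 << 20)
            if not chunk:
                break
            out.write(chunk)

    # Refuse to keep data that does not match the published checksum.
    if sha256_of(tmp) != expected_sha256:
        tmp.unlink()
        raise ValueError(f"checksum mismatch for {filename}")

    tmp.replace(target)
    return target
```

<br>With something along these lines, the package itself would only need to ship a small table of names, URLs, and checksums, while the binary files live wherever we decide to host them.
<br>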
<div class="hm_signature"></div>
<br>
<blockquote class="hm_quoted_text" style="padding-left:8px;margin:0;border-left:1px solid rgb(185,185,185);color:rgb(100,100,100)">
<div>
On 30/03/2018 at 00:45, Stefan wrote:
</div>
<p></p>
<span style="white-space:pre-wrap" class="hm_plaintext">
<div class="hm_quote_toggle">
On Thu, 29 Mar 2018 15:43:50 -0400, Warren Weckesser wrote:
<br>
<blockquote type="cite">
As a step towards the deprecation of `misc`, I propose that we create a
<br>new package, `scipy.data`, for holding data sets. `ascent()` and `face()`
<br>would move there, and the new ECG data set proposed in a current pull
<br>request (<a href="https://github.com/scipy/scipy/pull/8627">https://github.com/scipy/scipy/pull/8627</a>) would be put there.
<br>
</blockquote>
<br>
</div>We've been doing this in scikit-image for a long time, and now regret<br>having any binary data in the repository; we are working on a way of<br>hosting it outside instead.<br><br>Can we standardize on downloader tools? There are examples in<br>scikit-learn, dipy, and many other packages. We were thinking of a very<br>lightweight spec + tools for solving this problem a while ago, but never<br>got very far:<br><br><a href="https://github.com/data-pack/data-pack/pull/1/files">https://github.com/data-pack/data-pack/pull/1/files</a><br><br>Best regards<br>Stéfan
<div class="hm_quote_toggle">
_______________________________________________
<br>SciPy-Dev mailing list
<br><a href="mailto:SciPy-Dev@python.org">SciPy-Dev@python.org</a>
<br><a href="https://mail.python.org/mailman/listinfo/scipy-dev">https://mail.python.org/mailman/listinfo/scipy-dev</a>
<br>
</div></span>
</blockquote>