I concur with Stéfan. Keeping datasets out of the package seems like the best way to go. There should be a separate go-to place for datasets (other than the minimal ones needed for test cases).

I would recommend splitting all datasets off into a separate place; otherwise we add to SciPy's already significant size.
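
To make the idea concrete, here is a rough sketch of the kind of lightweight downloader Stéfan raises below: fetch a file on first use, cache it locally, and verify a checksum. It is only an illustration, not an existing SciPy or data-pack API. The `fetch` function, the cache path, and the parameter names are all made up for the example, and it uses nothing beyond the standard library.

```python
import hashlib
import urllib.request
from pathlib import Path

# Hypothetical cache location; not an existing SciPy convention.
DEFAULT_CACHE = Path.home() / ".cache" / "example-datasets"


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large files are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch(filename, url, expected_sha256, cache_dir=DEFAULT_CACHE,
          download=urllib.request.urlretrieve):
    """Return a local path to `filename`, downloading it only if needed.

    `download` is any callable taking (url, destination). It is a
    parameter so the network step can be swapped out, e.g. in tests.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / filename

    # Reuse a cached copy only if its checksum still matches.
    if target.exists() and sha256_of(target) == expected_sha256:
        return target

    download(url, str(target))
    actual = sha256_of(target)
    if actual != expected_sha256:
        # Do not leave a corrupt or unexpected file in the cache.
        target.unlink()
        raise ValueError(f"checksum mismatch for {filename}: got {actual}")
    return target
```

With something along these lines, the package itself would only need to ship a small registry of file names, URLs, and checksums, while the binary files live elsewhere.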

On 30/03/2018 at 00:45, Stefan wrote:

On Thu, 29 Mar 2018 15:43:50 -0400, Warren Weckesser wrote:
As a step towards the deprecation of `misc`, I propose that we create a
new package, `scipy.data`, for holding data sets. `ascent()` and `face()`
would move there, and the new ECG data set proposed in a current pull
request (https://github.com/scipy/scipy/pull/8627) would be put there.

We've been doing this in scikit-image for a long time, and now regret
having any binary data in the repository; we are working on a way of
hosting it outside instead.

Can we standardize on downloader tools? There are examples in
scikit-learn, dipy, and many other packages. We were thinking of a very
lightweight spec + tools for solving this problem a while ago, but never
got very far:

https://github.com/data-pack/data-pack/pull/1/files

Best regards
Stéfan
_______________________________________________
SciPy-Dev mailing list
SciPy-Dev@python.org
https://mail.python.org/mailman/listinfo/scipy-dev