I concur with Stefan. Not having datasets in a package seems like the best way to go. There should be a separate go-to place for datasets (other than minimal ones for test cases). I would recommend branching off all datasets... Otherwise we add to SciPy's already significant size.

On 30/03/2018 at 00:45, Stefan wrote:
> On Thu, 29 Mar 2018 15:43:50 -0400, Warren Weckesser wrote:
> > As a step towards the deprecation of `misc`, I propose that we create a
> > new package, `scipy.data`, for holding data sets. `ascent()` and `face()`
> > would move there, and the new ECG data set proposed in a current pull
> > request (https://github.com/scipy/scipy/pull/8627) would be put there.
>
> We've been doing this in scikit-image for a long time, and now regret
> having any binary data in the repository; we are working on a way of
> hosting it outside instead.
>
> Can we standardize on downloader tools? There are examples in
> scikit-learn, dipy, and many other packages.
>
> We were thinking of a very lightweight spec + tools for solving this
> problem a while ago, but never got very far:
>
> https://github.com/data-pack/data-pack/pull/1/files
>
> Best regards
> Stéfan
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev@python.org
> https://mail.python.org/mailman/listinfo/scipy-dev