Would a separate scipy-datasets repo help? Then something like

try:
    import the dataset
except ImportError:
    warn("I'm off to the interwebz")
    download it from the repo

might be feasible. The download step could fetch either just that particular dataset or a clone of the whole scipy-datasets repo.
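
A rough Python sketch of that fallback, in case it helps the discussion. Everything named here is an assumption for illustration only: the package name scipy_datasets, the BASE_URL placeholder, and the load_dataset helper do not exist anywhere. It covers only the "fetch that particular dataset" variant, not the full clone.

```python
import importlib
import warnings
from pathlib import Path
from urllib.request import urlretrieve

# Hypothetical placeholder; no such repository URL is implied to exist.
BASE_URL = "https://example.invalid/scipy-datasets/raw/master"


def load_dataset(name, cache_dir, package="scipy_datasets", fetch=urlretrieve):
    """Return a local path to dataset `name`.

    Prefer a locally installed data package (hypothetical name); if it
    cannot be imported, warn and download the single file into `cache_dir`.
    """
    try:
        pkg = importlib.import_module(package)
    except ImportError:
        pkg = None

    if pkg is not None:
        # Assumes a regular package that ships its data files next to
        # its __init__.py.
        return Path(pkg.__file__).parent / name

    warnings.warn(f"{package} is not installed; downloading {name!r} instead")
    target = Path(cache_dir) / name
    if not target.exists():
        # Download only once; later calls reuse the cached copy.
        target.parent.mkdir(parents=True, exist_ok=True)
        fetch(f"{BASE_URL}/{name}", str(target))
    return target
```

The fetch argument is there so that the network call can be swapped out, for example in tests or to plug in a "clone the whole repo" strategy instead.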




On Fri, Mar 30, 2018 at 1:16 AM, Stefan van der Walt <stefanv@berkeley.edu> wrote:
On Thu, 29 Mar 2018 18:54:52 -0400, Warren Weckesser wrote:
> Can you summarize the problems that make you regret including the
> data?

- The size of the repository (extra time on each clone, and that for
  data that isn't necessary in most use cases)

- Artificial limit on data sizes: we now have a default place to store
  data, but we still need an additional mechanism for larger datasets.
  How do you choose the threshold for what goes in, what is too big?

- Because these tiny embedded datasets are easily available, they become
  the default for demos.  If data is stored externally, realistic
  examples become more feasible and likely.

Best regards
Stéfan
_______________________________________________
SciPy-Dev mailing list
SciPy-Dev@python.org
https://mail.python.org/mailman/listinfo/scipy-dev