[Neuroimaging] ANN: openneuro-py, a new app for downloading OpenNeuro datasets
Yaroslav Halchenko
lists at onerussian.com
Tue Dec 15 11:52:50 EST 2020
On Tue, 15 Dec 2020, Christopher Markiewicz wrote:
> Hi all,
> FWIW almost all public datasets have been pushed to GitHub and can be accessed via datalad (exceptions being tracked on these issues: https://github.com/OpenNeuroOrg/openneuro/issues/1741 and https://github.com/OpenNeuroOrg/openneuro/issues/1743).
> datalad install https://github.com/OpenNeuroDatasets/ds00WXYZ.git
> Datalad makes it pretty straightforward to download only the portions of the data you want.
FWIW, datalad also comes with a Python API for all of its
functionality, so an analogous command that fetches specific subjects (or
any path you like) would be something like
$> python3 -c 'import datalad.api as dl; ds = dl.install("https://github.com/OpenNeuroDatasets/ds000001.git"); ds.get([f"sub-{i:02d}" for i in [1,2,3]], jobs=5)'
The same approach works for HCP, many INDI, and other datasets. Explore some more on
http://datasets.datalad.org/ and learn more about datalad at
http://handbook.datalad.org/
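The one-liner above can also be spelled out as a small script. This is only a sketch: the helper names subject_paths and fetch_subjects are made up for illustration and are not part of datalad, and it assumes subject directories follow the zero-padded "sub-NN" layout used by ds000001.

```python
def subject_paths(subject_ids):
    """Build subject directory names, e.g. 1 -> 'sub-01' (assumed layout)."""
    return [f"sub-{i:02d}" for i in subject_ids]


def fetch_subjects(url, subject_ids, jobs=5):
    """Clone a dataset and fetch file content only for the given subjects."""
    # Imported lazily so subject_paths() is usable without datalad installed.
    import datalad.api as dl

    # install() clones the git repository; annexed file content is not
    # downloaded at this point.
    ds = dl.install(source=url)
    # get() then downloads content for the requested paths, using up to
    # `jobs` parallel transfers.
    ds.get(subject_paths(subject_ids), jobs=jobs)
    return ds
```

Calling fetch_subjects("https://github.com/OpenNeuroDatasets/ds000001.git", [1, 2, 3]) would then do the same as the command above.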
> This isn't to deprecate Richard's tool.
Agreed! DataLad uses git and git-annex, which might be a bit heavy as a
dependency for some use cases. We are working hard though to ensure that
datalad with all its dependencies is easy to install (Windows remains somewhat
of an issue, but is ok for the "downloader" part ;))
BUT
- see https://github.com/nidata/nidata - a similar concept exercised in the
past (back then it was openfmri), which also covers more data sources.
Unfortunately development stopped. If the goal is to pursue a pure
downloader, it might be worth somehow joining forces with that prior effort.
- with datalad you get not just a "downloader" but overall content management,
for results as well as for "source data". See e.g.
https://github.com/ReproNim/containers/#a-typical-workflow
for an example of a prototypical computational workflow, where source data and
results are versioned and "reproducible".
Sorry for the shameless DataLad plug and some grains of salt in my follow-up. I
don't want to sound negative or unsupportive, but it is also hard to be
unbiased with my DataLad hat on. And if the project is to be
created/maintained beyond "an exercise", those points might be worth
considering.
Cheers,
--
Yaroslav O. Halchenko
Center for Open Neuroscience http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
WWW: http://www.linkedin.com/in/yarik