I’m happy to announce new releases of FaST-LMM<https://pypi.org/project/fastlmm/> and PySnpTools<https://pypi.org/project/pysnptools/>. (These releases have been my “work” since I retired last summer.)
The new releases update both packages to work with the newest versions of Pandas, NumPy, and scikit-learn.
The new FaST-LMM release includes single_snp_scale, which allows FaST-LMM to use a cluster and scale to 1 million individuals. See Kadie and Heckerman, bioRxiv 2018<https://www.biorxiv.org/content/10.1101/154682v2> for background. Similar tools would require 100,000 computers to scale this much, but FaST-LMM needs “only” a cluster of 100 computers. (The code can run on any cluster, but to run on a particular cluster we must create a module detailing how to automate batch jobs and move files.)
The new PySnpTools release adds support for cluster-sized data, including:
* snpreader.SnpGen<https://fastlmm.github.io/PySnpTools/#pysnptools.snpreader.SnpGen>: Generate synthetic SNP data on the fly.
* snpreader.SnpMemMap<https://fastlmm.github.io/PySnpTools/#pysnptools.snpreader.SnpMemMap>: Support larger in-memory data via on-disk memory mapping.
* snpreader.DistributedBed<https://fastlmm.github.io/PySnpTools/#pysnptools.snpreader.DistributedBed>: Split Bed<https://fastlmm.github.io/PySnpTools/#pysnptools.snpreader.Bed>-like data into multiple files for more efficient cluster use
* util.mapreduce1<https://fastlmm.github.io/PySnpTools/#module-pysnptools.util.mapreduce1>: Run loops in parallel on multiple processes, threads, or clusters
* util.filecache<https://fastlmm.github.io/PySnpTools/#module-pysnptools.util.filecache>: Automatically copy files to and from any remote storage.
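To illustrate the kind of loop that a map-reduce utility parallelizes, here is a minimal sketch using only Python's standard library. It is not the util.mapreduce1 API; the names map_reduce and count_alleles and the toy data are hypothetical, chosen only to show the pattern of mapping independent work items in parallel and then reducing the results.

```python
from concurrent.futures import ThreadPoolExecutor


def map_reduce(inputs, mapper, reducer, workers=4):
    """Apply `mapper` to every input in parallel, then combine with `reducer`.

    Hypothetical helper for illustration; not part of PySnpTools.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs.
        results = list(pool.map(mapper, inputs))
    return reducer(results)


def count_alleles(snp_column):
    """Hypothetical per-SNP work: sum the 0/1/2 allele counts in one column."""
    return sum(snp_column)


# Three toy SNP columns; each value is one individual's allele count.
columns = [[0, 1, 2, 1], [2, 2, 0, 0], [1, 1, 1, 1]]
totals = map_reduce(columns, count_alleles, reducer=list)
print(totals)
```

Because each column is processed independently, the same loop could in principle be spread over threads, processes, or cluster nodes without changing the mapper or the reducer.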
FaST-LMM and PySnpTools were originally developed and open sourced at Microsoft Research. Active development is now based at https://fastlmm.github.io/.
Roadmap:
I plan to continue working on FaST-LMM and PySnpTools. We’d like to run a giant job on real, rather than synthetic, data. We’d like to compare it to other fast methods that we suspect sacrifice accuracy. I’d like to port it from Python 2 to Python 3. (More to-dos: analyze multiple traits in one run, analyze pairs of DNA locations using the single-DNA-location tools, …)
Contacts:
Email the developers at fastlmm-dev(a)python.org<mailto:fastlmm-dev@python.org>.
Join<mailto:fastlmm-user-join@python.org?subject=Subscribe> the user discussion and announcement list (or use web sign up<https://mail.python.org/mailman3/lists/fastlmm-user.python.org>).
Yours,
Carl
Carl Kadie, Ph.D.
FaST-LMM Team
Greetings,
I'm happy to announce new versions of all our Python packages. The packages are now all compatible with the newest version of Python and pip. The underlying code is now also faster and safer.
Details:
* Bed-Reader<https://pypi.org/project/bed-reader/> - We've replaced our underlying C++ code with Rust code.
    * Guarantees no race conditions, making one possible bug impossible.
    * Avoids versioning conflicts with other packages' use of the C++ OpenMP library.
    * Remains multithreaded and very fast.
* PySnpTools<https://pypi.org/project/pysnptools/>
    * Adds multithreading to standardization and subsetting, making these operations 1.5 to 10 times faster.
    * Uses updated Bed-Reader.
* FaST-LMM<https://pypi.org/project/fastlmm/>
    * Updates install instructions to work with newer "pip" (see below).
    * Adds support for Python 3.9.
    * Uses updated Bed-Reader and PySnpTools.
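For readers unfamiliar with SNP standardization, the following sketch shows one common convention in plain Python. It assumes that standardizing means scaling each SNP column to mean 0 and standard deviation 1, with missing values ignored in the statistics and then filled with 0. That convention is an assumption here, not something stated in this announcement, and standardize_column is a hypothetical name rather than a PySnpTools function.

```python
import math


def standardize_column(values):
    """Scale one SNP column to mean 0 and standard deviation 1.

    Assumed convention: NaN marks a missing value, is skipped when
    computing the mean and standard deviation, and becomes 0 afterwards.
    """
    observed = [v for v in values if not math.isnan(v)]
    if not observed:
        # Nothing observed: every entry is missing, so every entry is 0.
        return [0.0 for _ in values]
    mean = sum(observed) / len(observed)
    variance = sum((v - mean) ** 2 for v in observed) / len(observed)
    std = math.sqrt(variance)
    if std == 0:
        # A constant column carries no information; map it to all zeros.
        return [0.0 for _ in values]
    return [0.0 if math.isnan(v) else (v - mean) / std for v in values]


# One toy column of allele counts with a missing value at the end.
column = [0.0, 1.0, 2.0, float("nan")]
standardized = standardize_column(column)
print(standardized)
```

Each column is handled independently of the others, which is the property that makes this kind of operation a natural fit for multithreading.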
Bed-Reader and PySnpTools continue to work with any distribution of Python.
To install FaST-LMM, start with Miniconda or Anaconda (as before). Then ...
conda install "mkl==2019.4" "scipy" "numpy"
pip install --no-build-isolation fastlmm
Thanks for using FaST-LMM. We're happy to answer questions and help with issues.
Yours,
Carl
Carl Kadie, Ph.D.
FaST-LMM & PySnpTools Team<https://fastlmm.github.io/>
(Microsoft Research, retired)
https://www.linkedin.com/in/carlk/
Join the FaST-LMM user discussion and announcement list via email<mailto:fastlmm-user-join@python.org?subject=Subscribe> (or use web sign up<https://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.pyth…>)