
I'm pleased to announce the release of the latest major version of xarray, v0.9. xarray is an open source project and Python package that provides a toolkit and data structures for N-dimensional labeled arrays. Its approach combines an API inspired by pandas with the Common Data Model for self-describing scientific data.

This release includes five months' worth of enhancements and bug fixes from 24 contributors, including some significant enhancements to the data model that are not fully backwards compatible.

Highlights include:

- Coordinates are now optional in the xarray data model, even for dimensions.
- Changes to caching, lazy loading and pickling to improve xarray’s experience for parallel computing.
- Improvements for accessing and manipulating pandas.MultiIndex levels.
- Many new methods and functions, including quantile(), cumsum(), cumprod(), combine_first(), set_index(), reset_index(), reorder_levels(), full_like(), zeros_like(), ones_like(), open_dataarray(), compute(), Dataset.info(), testing.assert_equal(), testing.assert_identical(), and testing.assert_allclose().

For more details, read the full release notes: http://xarray.pydata.org/en/latest/whats-new.html

You can install xarray with pip or conda:

    pip install xarray
    conda install -c conda-forge xarray

Best,
Stephan
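For a quick taste, here is a minimal sketch exercising a few of the new features; the dataset, variable names, and dimension names below are invented for illustration:

    import numpy as np
    import xarray as xr

    # Toy data; names are hypothetical. Note the "time" dimension has no
    # coordinate, which the v0.9 data model now allows.
    ds = xr.Dataset(
        {"temperature": (("station", "time"), np.random.rand(3, 4))},
        coords={"station": ["a", "b", "c"]},
    )

    print(ds.quantile(0.5, dim="time"))       # new in v0.9: quantile()
    print(ds["temperature"].cumsum("time"))   # new in v0.9: cumsum()
    zeros = xr.zeros_like(ds["temperature"])  # new in v0.9: zeros_like()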

On 1 Feb 2017, at 05:19, Stephan Hoyer <shoyer@gmail.com> wrote:
> This release includes five months' worth of enhancements and bug fixes from 24 contributors, including some significant enhancements to the data model that are not fully backwards compatible.
Looks very nice; is the API stable or are you waiting for a v1.0 release? Is there significant overhead compared to plain ndarray?

On Wed, Feb 1, 2017 at 12:55 AM, Marmaduke Woodman <mmwoodman@gmail.com> wrote:
> Looks very nice; is the API stable or are you waiting for a v1.0 release?
We are pretty close to full API stability but not quite there yet. Enough people are using xarray in production that breaking changes are made with serious caution (and deprecation cycles whenever feasible).

The only major backwards-incompatible change planned is an overhaul of indexing to use labeled broadcasting and alignment: https://github.com/pydata/xarray/issues/974

There are a few other "nice to have" features for v1.0, but that's the only one that has the potential to change functionality in a way that we can't cleanly deprecate.
> Is there significant overhead compared to plain ndarray?
Xarray is implemented in Python (not C), so it does have significant per-operation overhead: adding two arrays takes ~100 µs, versus <1 µs in NumPy, so you don't want to use it in your inner loop. That said, the overhead is roughly constant regardless of array size, so if you work with large arrays it is negligible.
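A rough way to see this for yourself is a micro-benchmark along the following lines (a sketch; exact timings will vary by machine, and the ~100 µs figure above is from the message, not from this code):

    import timeit

    import numpy as np
    import xarray as xr

    # Compare element-wise addition in NumPy vs. xarray for a tiny and a
    # large array: xarray's fixed per-operation Python overhead dominates
    # when the array is small and becomes negligible when it is large.
    for n in (10, 1_000_000):
        a = np.arange(n, dtype=float)
        da = xr.DataArray(a, dims="x")
        t_np = min(timeit.repeat(lambda: a + a, number=100, repeat=5)) / 100
        t_xr = min(timeit.repeat(lambda: da + da, number=100, repeat=5)) / 100
        print(f"n={n:>9,}: numpy {t_np * 1e6:8.1f} us, xarray {t_xr * 1e6:8.1f} us")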
