[AstroPy] Projects involving irregularly shaped data
jpivarski at gmail.com
Wed Oct 7 15:59:14 EDT 2020

Adrian Price-Whelan recommended that I ask my question here, since it would
reach a greater number of people involved in astronomical software.

I'm a developer of Awkward Array
<https://awkward-array.org/what-is-awkward.html>, a Python package for
manipulating large, irregularly shaped datasets: arrays with
variable-length lists, nested records, missing values, or mixed data types.
The interface is a strict generalization of NumPy: you can slice jagged
arrays as though they were ordinary multidimensional arrays, and there are
new functions that only make sense in the context of irregular data. Like
NumPy, the actual calculations are precompiled loops on internally
homogeneous arrays, and we're expanding it to include GPUs transparently
(irregular data on GPUs in a NumPy-like syntax).
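As a rough illustration of what "precompiled loops on internally homogeneous arrays" can mean, here is a minimal NumPy-only sketch. It is not Awkward Array's API or its actual implementation. It assumes a layout in which a jagged array is stored as one flat `content` buffer plus an `offsets` array marking where each list starts and stops. The names `jagged`, `content`, `offsets`, `counts`, `sums`, and `firsts`, as well as the data values, are hypothetical and chosen only for this example.

```python
import numpy as np

# Jagged data: three lists of different lengths (made-up values).
jagged = [[1.1, 2.2, 3.3], [], [4.4, 5.5]]

# Store it as one homogeneous buffer plus offsets marking list boundaries.
# List i occupies content[offsets[i]:offsets[i + 1]].
content = np.array([x for sub in jagged for x in sub], dtype=np.float64)
offsets = np.zeros(len(jagged) + 1, dtype=np.int64)
offsets[1:] = np.cumsum([len(sub) for sub in jagged])

# Number of items in each list: differences of consecutive offsets.
counts = np.diff(offsets)

# Per-list sums without a Python loop over lists:
# take prefix sums of the flat content and difference them at the boundaries.
prefix = np.concatenate(([0.0], np.cumsum(content)))
sums = prefix[offsets[1:]] - prefix[offsets[:-1]]

# First item of every non-empty list, again via plain array indexing.
nonempty = counts > 0
firsts = content[offsets[:-1][nonempty]]

print(offsets, counts, sums, firsts)
```

The point of the sketch is that every operation acts on flat, fixed-type arrays, so the per-list structure never has to be walked in Python. That is the property that lets a library run such operations as compiled loops.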

This package was developed for particle physics (variable numbers of
particles emerging from an array of collision events), but it seems like
these problems would exist in other fields as well. Right now, we're
working on a proposal to find data analysis projects that need to deal with
large, irregularly structured data to see if Awkward Array is applicable
and if it can be made more useful for them. Ideally, this would motivate
more interoperability with other scientific Python libraries. (We can
already use Awkward Arrays in Numba; we're working on cuDF, Dask, and Zarr.
Adrian also recommended ASDF, which I'm looking into now.)

Does anyone have or know about a data analysis project that is currently
limited by this combination of large + irregular data? Is anyone interested