[AstroPy] Projects involving irregularly shaped data
P.Wortmann at skatelescope.org
Fri Oct 9 10:29:23 EDT 2020
Another possible use case you might want to be aware of: for the Square
Kilometre Array (https://www.skatelescope.org/), we are currently in the
early stages of evaluating concrete technologies for data exchange
within our pipelines. We are expecting heavy I/O workloads and
want to evolve our software quite a bit over the lifetime of the
observatory, so we are considering building around Apache Arrow (or
similar) in-memory data structures.
While most of our "primary" data is likely regularly shaped, there will
definitely be very significant amounts of "secondary" data, at minimum
sky models and complex calibration and flagging data. To us,
Awkward sounds like a very good candidate for dealing with such data.
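To make "irregularly shaped" concrete, here is a minimal Python sketch, assuming a made-up sky model in which each field holds a different number of sources. It stores the nested lists as one flat values list plus an offsets list, which is the general idea behind Arrow's variable-length list layout. The function names and flux values are hypothetical illustrations, not SKA or Awkward code.

```python
# Hypothetical sky model: flux values (Jy) per field; the numbers are made up.
# Each field has a different number of sources, so the data is "jagged".
fluxes_per_field = [[1.2, 0.4, 3.1], [], [0.9, 2.2]]


def to_offsets(nested):
    """Flatten a list of lists into (offsets, content).

    content holds all values back to back; offsets[i]:offsets[i + 1]
    marks the slice of content that belongs to sub-list i.
    """
    offsets = [0]
    content = []
    for sub in nested:
        content.extend(sub)
        offsets.append(len(content))
    return offsets, content


def get_list(offsets, content, i):
    """Recover sub-list i from the flat representation."""
    return content[offsets[i]:offsets[i + 1]]


def counts(offsets):
    """Number of entries in each sub-list, computed from offsets alone."""
    return [offsets[i + 1] - offsets[i] for i in range(len(offsets) - 1)]


offsets, content = to_offsets(fluxes_per_field)
print(offsets)                         # boundaries between fields
print(content)                         # all fluxes in one homogeneous list
print(counts(offsets))                 # sources per field
print(get_list(offsets, content, 2))   # fluxes of the third field
```

Because the values live in a single homogeneous buffer, per-element operations can run over `content` directly without touching the nesting, and only the offsets need to be consulted when the list boundaries matter.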
We will likely be building some prototypes over the next year to see how
we could utilise Awkward in processing. I realise this is not quite what
you were asking for, but it might still be worthwhile to get in touch at
(Data Processing Architect, Square Kilometre Array Organisation)
On 07/10/2020 20:59, Jim Pivarski wrote:
> Hi everyone,
> Adrian Price-Whelan recommended that I ask my question here, since it
> would reach a greater number of people involved in astronomical software.
> I'm a developer of Awkward Array
> <https://awkward-array.org/what-is-awkward.html>, a Python package for
> manipulating large, irregularly shaped datasets: arrays with
> variable-length lists, nested records, missing values, or mixed data
> types. The interface is a strict generalization of NumPy: you can slice
> jagged arrays as though they were ordinary multidimensional arrays, and
> there are new functions that only make sense in the context of irregular
> data. Like NumPy, the actual calculations are precompiled loops on
> internally homogeneous arrays, and we're expanding it to include GPUs
> transparently (irregular data on GPUs in a NumPy-like syntax).
> This package was developed for particle physics (variable numbers of
> particles emerging from an array of collision events), but it seems like
> these problems would exist in other fields as well. Right now, we're
> working on a proposal to find data analysis projects that need to deal
> with large, irregularly structured data to see if Awkward Array is
> applicable and if it can be made more useful for them. Ideally, this
> would motivate more interoperability with other scientific Python
> libraries. (We can already use Awkward Arrays in Numba; we're working on
> cuDF, Dask, and Zarr. Adrian also recommended ASDF, which I'm looking
> into now.)
> Does anyone have or know about a data analysis project that is currently
> limited by this combination of large + irregular data? Is anyone
> interested in collaborating?
> Thank you!
> -- Jim