Dear numpy developers,
I would like to share a proposal to make ndarray JSON serializable by default, as detailed in this GitHub issue:
https://github.com/numpy/numpy/issues/20461
Briefly, my group and collaborators are working on a new NIH
(National Institutes of Health) funded initiative - NeuroJSON
(http://neurojson.org) - to further disseminate a lightweight data
annotation specification (JData) among the broad
neuroimaging/scientific community. Python and numpy are widely used
in neuroimaging data analysis pipelines (nipy, nibabel, mne-python,
PySurfer ...), because the N-D array is THE most important data
structure in scientific data. However, numpy currently does not
support JSON serialization by default. This is one of the frequently
requested features on GitHub (#16432, #12481).
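
To illustrate the gap, here is a minimal sketch of what users run into today and the usual workaround. The class name NdarrayEncoder is only an illustrative name, not part of numpy or of our modules. Note that this workaround drops the dtype, which is part of what the annotated format is meant to preserve.

```python
import json
import numpy as np

a = np.arange(6).reshape(2, 3)

# The standard json module does not know how to handle ndarray,
# so a plain json.dumps() call raises TypeError.
try:
    json.dumps(a)
    failed = False
except TypeError:
    failed = True

# Common workaround: a custom encoder (hypothetical name) that
# falls back to nested lists. The dtype is lost in the process.
class NdarrayEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return super().default(obj)

text = json.dumps({"a": a}, cls=NdarrayEncoder)
print(failed, text)
```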
We have developed lightweight Python modules (jdata, bjdata) to help export/import ndarray objects to/from JSON (and a binary JSON format - BJData/UBJSON - to gain efficiency). The approach is to convert ndarray objects to a dictionary with subfields using standardized JData annotation tags. The JData spec can serialize complex data structures such as N-D arrays (solid, sparse, complex), trees, graphs, tables etc. It also permits data compression. These annotations have been implemented in my MATLAB toolbox - JSONLab - since 2011 to help import/export MATLAB data types, and have been broadly used among MATLAB/GNU Octave users.
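
As a rough sketch of the dictionary-based approach described above (not the actual jdata implementation), a dense real-valued ndarray could be mapped to and from an annotated dictionary as follows. The helper names encode_ndarray/decode_ndarray and the small type-name table are assumptions made for illustration; the tag names _ArrayType_, _ArraySize_ and _ArrayData_ and the row-major data order are taken from my reading of the JData examples and should be checked against the spec. Sparse and complex arrays and compression are not covered here.

```python
import json
import numpy as np

# Assumed mapping between numpy dtype names and JData type names;
# other dtypes fall back to the numpy name (e.g. "int32", "uint8").
_NUMPY_TO_JDATA = {"float64": "double", "float32": "single"}
_JDATA_TO_NUMPY = {v: k for k, v in _NUMPY_TO_JDATA.items()}

def encode_ndarray(arr):
    """Hypothetical helper: ndarray -> JSON-serializable annotated dict."""
    arr = np.asarray(arr)
    name = arr.dtype.name
    return {
        "_ArrayType_": _NUMPY_TO_JDATA.get(name, name),
        "_ArraySize_": list(arr.shape),
        # Flattened in row-major (C) order.
        "_ArrayData_": arr.ravel().tolist(),
    }

def decode_ndarray(obj):
    """Hypothetical helper: annotated dict -> ndarray with dtype and shape restored."""
    name = obj["_ArrayType_"]
    dtype = _JDATA_TO_NUMPY.get(name, name)
    return np.array(obj["_ArrayData_"], dtype=dtype).reshape(obj["_ArraySize_"])

a = np.arange(6, dtype=np.float32).reshape(2, 3)
text = json.dumps(encode_ndarray(a))   # plain JSON text
b = decode_ndarray(json.loads(text))   # same shape and dtype as a
```

Because the result is an ordinary dictionary of strings, lists and numbers, any JSON library on any platform can read it without knowing about numpy.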
Examples of these portable JSON annotation tags representing N-D arrays can be found at
http://openjdata.org/wiki/index.cgi?JData/Examples/Basic#2_D_arrays_in_the_annotated_format
http://openjdata.org/wiki/index.cgi?JData/Examples/Advanced
and the detailed formats on N-D array annotations can be found in
the spec:
Our current Python code to encode/decode ndarray to and from JSON
serializable forms is implemented in these compact functions
(handling lossless type/data conversion and data compression):
https://github.com/NeuroJSON/pyjdata/blob/63301d41c7b97fc678fa0ab0829f76c762a16354/jdata/jdata.py#L72-L97
https://github.com/NeuroJSON/pyjdata/blob/63301d41c7b97fc678fa0ab0829f76c762a16354/jdata/jdata.py#L126-L160
We strongly believe that enabling JSON serialization by default
will benefit the numpy user community, making it a lot easier to
share complex data between platforms
(MATLAB/Python/C/FORTRAN/JavaScript...) via a standardized/NIH-backed
data annotation scheme.
We would be happy to hear your thoughts and suggestions on how to
contribute, and would be glad to set up dedicated discussions.
Cheers
Qianqian