
Thank you, Matti, for this response. I added to issue 12481 because, in my opinion, the format proposal responds to that issue. However, if you think a dedicated issue is preferable, I can create one. To clarify the proposed standard: it represents multidimensional data of any type. The only constraint is that each value can be represented in JSON. This is of course the case for all pandas types, but a value can also be, for instance, a year, a polygon, a URI, or a type defined in Darwin Core or in schema.org. This means that each library or framework must convert such JSON data into an internal value (e.g. a polygon can be translated into a shapely object). The defined types are described in the NTV Internet-Draft [2].
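
As an illustration, here is a minimal sketch of how a framework might map such a typed JSON value to an internal object. The "name:type" key layout and the "polygon" type name are used here only for illustration; the actual structure and type list are those defined in the NTV Internet-Draft [2].

    import json
    from shapely.geometry import Polygon

    # Hypothetical NTV-style entry: the "name:type" key convention and the
    # "polygon" type name are illustrative, not taken from the draft.
    ntv_json = '{"zone:polygon": [[2.35, 48.85], [2.36, 48.85], [2.36, 48.86], [2.35, 48.85]]}'

    entry = json.loads(ntv_json)
    (key, coords), = entry.items()       # single "name:type" / value pair
    name, ntv_type = key.split(":")

    # The receiving framework chooses the internal representation,
    # e.g. a shapely Polygon for the "polygon" type.
    value = Polygon(coords) if ntv_type == "polygon" else coords
    print(name, ntv_type, value.area)
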
- How does it handle sharing data? NumPy can handle very large ndarrays, and a read-only container with a shared memory location, like in DLPack [0] seems more natural than a format that precludes sharing data.
Concerning the first question, the purpose of this standard is complementary to what DLPack proposes (DLPack offers a standard mechanism for accessing in-memory data, which avoids duplication between frameworks):
- the format is a neutral, reversible exchange format built on JSON (and therefore with duplication) which can be used independently of any framework;
- the supported data types are more numerous and have a broader scope than those covered by DLPack (numeric types only).
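
To make the distinction concrete, a minimal sketch (assuming a recent NumPy, 1.23 or later, where ndarrays expose the DLPack protocol): a DLPack exchange shares the underlying buffer, whereas a JSON round trip necessarily produces a copy.

    import json
    import numpy as np

    x = np.arange(6, dtype=np.float64).reshape(2, 3)

    # DLPack exchange: np.from_dlpack consumes the capsule exposed by x;
    # the result is a view sharing memory with the original buffer.
    y = np.from_dlpack(x)
    assert np.shares_memory(x, y)

    # JSON exchange: the data is serialized to text and parsed back,
    # so the round trip yields an independent copy.
    z = np.array(json.loads(json.dumps(x.tolist())))
    assert not np.shares_memory(x, z)
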
- Is there a size limitation either on the data or on the number of dimensions? Could this format represent, for instance, data with more than 100 dimensions, which could not be mapped back to NumPy?
Regarding the second question: no, the format imposes no limit on data size or on the number of dimensions (JSON does not impose limits on array sizes or nesting depth).
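
For instance, a minimal sketch of this asymmetry: JSON can nest arrays arbitrarily deep, while NumPy caps the number of dimensions (NPY_MAXDIMS, 32 in NumPy 1.x and 64 in NumPy 2.x), so such data cannot always be mapped back to an ndarray.

    import json
    import numpy as np

    # JSON nests arrays arbitrarily deep: wrap a value in 100 levels of lists.
    nested = 42
    for _ in range(100):
        nested = [nested]
    text = json.dumps(nested)            # serializes and parses without complaint
    assert json.loads(text) is not None

    # NumPy enforces a maximum number of dimensions, so a 100-dimensional
    # array cannot be created.
    try:
        np.ones((1,) * 100)
    except ValueError as exc:
        print(exc)                       # "maximum supported dimension for an ndarray is ..."
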
Perhaps, like the Pandas package, it should live outside NumPy for a while until some wider consensus could emerge.
Regarding this initial remark, this is indeed a possible option, but it depends on the answer to the following question: does NumPy want a neutral JSON exchange format for exchanging data with other frameworks (tabular, multidimensional or other)? This is why I am interested in gaining a better understanding of the needs (see the end of the initial email).

[2] https://www.ietf.org/archive/id/draft-thomy-json-ntv-02.html#appendix-A