
On Tue, Mar 24, 2020 at 12:12 PM Matti Picus <matti.picus@gmail.com> wrote:
On 24/3/20 11:48 am, Francesc Alted wrote:
What I am trying to say is that NumPy should be rather agnostic about providing data types beyond the relatively simple set that it already supports. I am suggesting that a better use of resources might be to focus on providing a way to store user-defined data types or any other kind of metadata (not only in memory, but also in arrays persisted via .npy/.npz files), and to let 3rd party libraries use this machinery to serialize/deserialize them.
... Cheers, Francesc
I agree that the goal is to enable user-defined data types, and even make the creation of them from Python possible (with some caveats about performance). But I think this should be done in steps, and as the subject line says this is the first step. There are many scary details to work out around the problems of promotion and casting, what to do when the output might overflow, how to mark missing values and more. The question at hand is, as I understand it, one of finding the right way to create a data type object that will enable exactly what you propose. I think this is the correct path, as most large refactor-in-one-step efforts I have seen leave both the old code and the new code in an unusable state for years until the bugs are worked out.
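To make the "scary details" concrete, here is a small sketch of how promotion, casting, and overflow behave today for the built-in dtypes. It uses only existing public NumPy functions; the variable names are illustrative, and it says nothing about how the NEP would handle the same questions for user-defined types.

```python
import numpy as np

# Promotion: which dtype results from mixing two dtypes?
# int8 and uint8 have no common 8-bit type, so NumPy picks a wider one.
promoted = np.result_type(np.int8, np.uint8)

# Casting: under the default "safe" rule, widening is allowed
# and value-losing conversions are refused.
widening_ok = np.can_cast(np.int32, np.float64)
narrowing_ok = np.can_cast(np.float64, np.int32)

# Overflow: fixed-width integer arrays wrap around silently.
a = np.array([127], dtype=np.int8)
wrapped = a + np.array([1], dtype=np.int8)

print(promoted, widening_ok, narrowing_ok, wrapped)
```

Any user-defined data type would need its own answer to each of these three questions.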
Thanks Matti for clarifying the goals of the NEP; having the phrase "New Datatype System" in the title did sound scary to my ears, and I share your concerns about new code remaining in a 'beta' stage for a long time. Before shutting up, I'll just reiterate that providing pretty shallow machinery for integrating user-defined data types should avoid big headaches: the simpler, the better. But this is of course up to the maintainers.
As for serialization protocols: I think that is a separate issue. We already have the npy/npz protocol, the PEP 3118 buffer protocol, and the pickle 5 buffering protocol. Each of them handles user-defined data types in a different way, with differing amounts of success.
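As a rough illustration of the three protocols named above, the sketch below round-trips one array through each of them. A structured dtype stands in for a "user-defined" type here; that substitution is an assumption made for the example, since a true user-defined dtype may behave differently under each protocol.

```python
import io
import pickle

import numpy as np

# Stand-in for a "user-defined" type (assumption: a plain structured dtype).
dt = np.dtype([("id", "<i4"), ("value", "<f8")])
arr = np.array([(1, 0.5), (2, 1.5)], dtype=dt)

# 1. npy format: the dtype description is written into the file header.
buf = io.BytesIO()
np.save(buf, arr)
buf.seek(0)
from_npy = np.load(buf)

# 2. PEP 3118 buffer protocol: the dtype travels as a struct-style
#    format string attached to the exported buffer.
view = memoryview(arr)
buffer_format = view.format
from_buffer = np.asarray(view)

# 3. Pickle protocol 5: the raw data can be handed over out-of-band
#    through buffer_callback instead of being copied into the payload.
out_of_band = []
payload = pickle.dumps(arr, protocol=5, buffer_callback=out_of_band.append)
from_pickle = pickle.loads(payload, buffers=out_of_band)

print(from_npy.dtype, buffer_format, from_pickle.dtype)
```

Each route carries the dtype description through a different mechanism (file header, format string, pickled dtype object), which is where the differing amounts of success come from.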
Yup, I forgot the buffer protocol and pickle 5. Thanks for the reminder. Cheers, -- Francesc Alted