Martin v. Löwis wrote:
Travis Oliphant schrieb:
r_field = PyDict_GetItemString(dtype,'r');
Actually it should read PyDict_GetItemString(dtype->fields). The r_field is a tuple (data-type object, offset). The fields attribute is (currently) a Python dictionary.
Ok. This seems to be missing in the PEP.
Yeah, actually quite a bit is missing. Because I wanted to float the idea for discussion before "getting the details perfect" (which of course they wouldn't be if it was just my input producing them).
In this code, where is PyArray_GetField coming from?
This is a NumPy Specific C-API. That's why I was confused about why you wanted me to show how I would do it.
But, what you are actually asking is how would another application use the data-type information to do the same thing using the data-type object and a pointer to memory. Is that correct?
This is a reasonable thing to request. And your example is a good one. I will use the PEP to explain it.
Ultimately, the code you are asking for will have to have some kind of dispatch table for different binary code depending on the actual data-types being shared (unless all that is needed is a copy in which case just the size of the element area can be used). In my experience, the dispatch table must be present for at least the "simple" data-types. The data-types built up from there can depend on those.
In NumPy, the data-type objects have function pointers to accomplish all the things NumPy does quickly. So, each data-type object in NumPy points to a function-pointer table and the NumPy code defers to it to actually accomplish the task (much like Python really).
Not all libraries will support working with all data-types. If they don't support it, they just raise an error indicating that it's not possible to share that kind of data.
What does it do? If I wanted to write this code from scratch, what should I write instead? Since this is all about a flat memory block, I'm surprised I need "true" Python objects for the field values in there.
Well, actually, the block could be "strided" as well.
So, you would write something that gets the pointer to the memory and then gets the extended information (dimensionality, shape, and strides, and data-format object). Then, you would get the offset of the field you are interested in from the start of the element (it's stored in the data-format representation).
Then do a memory copy from the right place (using the array iterator in NumPy you can actually do it without getting the shape and strides information first but I'm holding off on that PEP until an N-d array is proposed for Python). I'll write something like that as an example and put it in the PEP for the extended buffer protocol.
But, the other option (especially for code already written) would be to just convert the data-format specification into it's own internal representation.
Ok, so your assumption is that consumers already have their own machinery, in which case ease-of-use would be the question how difficult it is to convert datatype objects into the internal representation.