Travis E. Oliphant schrieb:
Or, if it does have uses independent of the buffer extension: what are those uses?
So that NumPy and ctypes and audio libraries and video libraries and database libraries and image-file format libraries can communicate about data-formats using the same expressions (in Python).
I find that puzzling. In what way can the specification of a data type enable communication? Don't you need some kind of protocol for it (i.e. operations to be invoked)? Also, do you mean that these libraries can communicate with each other? Or with somebody else? If so, with whom?
What problem do you have in defining a standard way to communicate about binary data-formats (not just images)? I still can't figure out why you are so resistant to the idea. MPI had to do it.
I'm afraid of "dead" specifications, things whose only motivation is that they look nice. They are just clutter. There are a few examples of this already in Python, like the character buffer interface or the multi-segment buffers.
As for MPI: It didn't just independently define a data types system. Instead, it did that, *and* specified the usage of the data types in operations such as MPI_SEND. It is very clear what the scope of this data description is, and what the intended usage is.
Without specifying an intended usage, it is impossible to evaluate whether the specification meets its goals.
Absolutely --- if something is to be made useful across packages and from Python. This is where the discussion should take place. The struct module and array modules would both be consumers also so that in the struct module you could specify your structure in terms of the standard data-represenation and in the array module you could specify your array in terms of the standard representation instead of using "character codes".
Ok, that would be a new usage: I expected that datatype instances always come in pairs with memory allocated and filled according to the description. If you are proposing to modify/extend the API of the struct and array modules, you should say so somewhere (in a PEP).
Suppose I wanted to change all RGB values to a gray value (i.e. R=G=B), what would the C code look like that does that? (it seems now that the primary purpose of this machinery is image manipulation)
For me it is definitely not image manipulation that is the only purpose (or even the primary purpose). It's just an easy one to explain --- most people understand images). But, I think this question is actually irrelevant (IMHO). To me, how you change all RGB values to gray would depend on the library you are using not on how data-formats are expressed.
Maybe we are still mis-understanding each other.
I expect that the primary readers/users of the PEP would be people who have to write libraries: i.e. people implementing NumPy, struct, array, and people who implement algorithms that operate on data. So usability of the specification is a matter of how easy it is to *write* a library that does perform the image manipulation.
If you really want to know. In NumPy it might look like this:
img['r'] = img['g'] img['b'] = img['g']
That's not what I'm asking. Instead, what does the NumPy code look like that gets invoked on these read-and-write operations? Does it only use the void* pointing to the start of the data, and the datatype object? If not, how would C code look like that only has the void* and the datatype object?
dtype = img->descr;
In this code, is descr a datatype object? ...
r_field = PyDict_GetItemString(dtype,'r');
... I guess not, because apparently, it is a dictionary, not a datatype object.
But, I still don't see how that is relevant to the question of how to represent the data-format to share that information across two extensions.
Well, if NumPy gets the data from a different module, it can't assume there is a descr object that is a dictionary. Instead, it must perform these operations just by using the datatype object. What else is the purpose of sharing the information, if not to use it to access the data?