On 01/30/2018 01:33 PM, email@example.com wrote:
AFAICS, one problem is that the padded view didn't come with matching downstream support: the pack function mentioned earlier, an alternative way to convert to a standard ndarray, the fact that copy doesn't get rid of the padding, and so on.
E.g., here is another mailing list thread I just found with the same problem: http://numpy-discussion.10968.n7.nabble.com/view-of-recarray-issue-td32001.h...
quoting Ralf: Question: is that really the recommended way to get an (N, 2) size float array from two columns of a larger record array? If so, why isn't there a better way? If you'd want to write to that (N, 2) array you have to append a copy, making it even uglier. Also, then there really should be tests for views in test_records.py.
This "better way" never showed up, AFAIK. And it looks like we keep coming back to this problem every few years.
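To make the problem concrete, here is a minimal sketch. It emulates a padded two-field view by building the dtype by hand (the field names, offsets, and sizes are made up for illustration, so the result does not depend on which numpy version's multi-field indexing you have), then shows that copy() keeps the padding and that a packed (N, 2) float array has to be assembled field by field.

```python
import numpy as np

# A record layout with three float64 fields, 24 bytes per record.
full = np.dtype([("x", "f8"), ("y", "f8"), ("z", "f8")])
arr = np.zeros(4, dtype=full)
arr["x"] = np.arange(4)
arr["z"] = 10 * np.arange(4)

# Emulate a padded two-field view (assumption: built by hand here):
# "x" and "z" keep their original offsets (0 and 16) and the itemsize
# stays at 24, so 8 unused bytes sit between the two fields.
padded = np.dtype({"names": ["x", "z"],
                   "formats": ["f8", "f8"],
                   "offsets": [0, 16],
                   "itemsize": 24})
view = arr.view(padded)

# copy() preserves the dtype, padding included.
copied = view.copy()
assert copied.dtype.itemsize == 24

# One way to get a packed (N, 2) float array: stack the fields.
# Note that this is a copy, so writes do not reach `arr`.
packed = np.stack([view["x"], view["z"]], axis=-1)
```

Because `packed` is a copy, writing results back still means assigning each column to its field, which is the ugliness Ralf was pointing at.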
Since we are at least pushing off this change to a later release (1.15?), we have some time to prepare/catch up.
What can we add to numpy.lib.recfunctions to make the multi-field copy->view change smoother? We have discussed at least two functions:
* repack_fields - rearrange the memory layout of a structured array to add/remove padding between fields
* structured_to_unstructured - turns an n-D structured array into an (n+1)-D unstructured ndarray, whose dtype is the highest common type of all the fields. We may want the inverse function too.
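A rough sketch of what these two could do, written with plain numpy only. The `_sketch` names are hypothetical and not an existing API; the sketch assumes flat (non-nested) fields, always returns copies, and uses np.result_type as a stand-in for "highest common type".

```python
import numpy as np

def repack_fields_sketch(arr):
    """Hypothetical helper: copy `arr` into a dtype without padding."""
    # A dtype built from a plain (name, format) list is packed by default.
    packed_dtype = np.dtype([(name, arr.dtype.fields[name][0])
                             for name in arr.dtype.names])
    out = np.empty(arr.shape, dtype=packed_dtype)
    for name in arr.dtype.names:
        out[name] = arr[name]
    return out

def structured_to_unstructured_sketch(arr):
    """Hypothetical helper: n-D structured -> (n+1)-D plain ndarray."""
    names = arr.dtype.names
    # Assumption: result_type approximates the "highest common type".
    common = np.result_type(*[arr.dtype.fields[n][0] for n in names])
    return np.stack([arr[n].astype(common) for n in names], axis=-1)

# A hand-built padded dtype: 4 unused bytes after "a", itemsize 16.
padded = np.dtype({"names": ["a", "b"],
                   "formats": ["i4", "f8"],
                   "offsets": [0, 8],
                   "itemsize": 16})
arr = np.zeros((2, 3), dtype=padded)
arr["a"] = 1
arr["b"] = 2.5

packed = repack_fields_sketch(arr)             # itemsize shrinks to 12
plain = structured_to_unstructured_sketch(arr)  # shape (2, 3, 2), float64
```

A real implementation would presumably return a view where the memory layout allows it; the copies here just keep the sketch short.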
We might also consider
* apply_along_fields(arr, method) - applies the method along the "field" axis, equivalent to something like method(structured_to_unstructured(arr), axis=-1)
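As a sketch of that equivalence, again with plain numpy: the function name below is hypothetical, the conversion step is inlined so the example stands alone, and flat fields are assumed.

```python
import numpy as np

def apply_along_fields_sketch(arr, method):
    """Hypothetical helper: reduce across the fields of each record."""
    names = arr.dtype.names
    # Inline stand-in for structured_to_unstructured: cast every field
    # to a common type and stack the fields along a new last axis.
    common = np.result_type(*[arr.dtype.fields[n][0] for n in names])
    unstructured = np.stack([arr[n].astype(common) for n in names], axis=-1)
    return method(unstructured, axis=-1)

pts = np.array([(1, 2.0), (3, 4.0)], dtype=[("x", "i4"), ("y", "f8")])
means = apply_along_fields_sketch(pts, np.mean)  # mean of x and y per record
```

Any callable that accepts an `axis` keyword (np.mean, np.sum, np.max, ...) would fit the `method` slot in this form.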
I think these are pretty minimal and shouldn't be too hard to implement.