[Numpy-discussion] Setting custom dtypes and 1.14

Tue Jan 30 15:21:20 EST 2018

On 01/30/2018 01:33 PM, josef.pktd at gmail.com wrote:
> AFAICS, one problem is that the padded view didn't come with the
> matching down stream usage support, the pack function as mentioned, an
> alternative way to convert to a standard ndarray, copy doesn't get rid
> of the padding and so on.
> 
> eg. another mailing list thread I just found with the same problem
> http://numpy-discussion.10968.n7.nabble.com/view-of-recarray-issue-td32001.html
> 
> quoting Ralf:
> Question: is that really the recommended way to get an (N, 2) size float
> array from two columns of a larger record array? If so, why isn't there
> a better way? If you'd want to write to that (N, 2) array you have to
> append a copy, making it even uglier. Also, then there really should be
> tests for views in test_records.py.
> 
> 
> This "better way" never showed up, AFAIK. And it looks like we came back
> to this problem every few years.
> 
> Josef

Since we are at least pushing off this change to a later release
(1.15?), we have some time to prepare/catch up.

What can we add to numpy.lib.recfunctions to make the multi-field
copy->view change smoother? We have discussed at least two functions:

 * repack_fields - rearrange the memory layout of a structured array to
add/remove padding between fields

 * structured_to_unstructured - turns an n-D structured array into an
(n+1)-D unstructured ndarray, whose dtype is the highest common type of
all the fields. May want the inverse function too.
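
To make the intent concrete, here is a rough sketch of both in plain
NumPy. It is only an illustration, not a proposed implementation: the
`_sketch` names are made up, and it assumes simple scalar fields (no
nested or subarray fields).

```python
import numpy as np

def repack_fields_sketch(arr):
    """Copy `arr` into a packed dtype with the same fields, no padding."""
    names = arr.dtype.names
    # A list-of-tuples dtype spec is packed (align=False) by default.
    packed = np.dtype([(n, arr.dtype[n]) for n in names])
    out = np.empty(arr.shape, dtype=packed)
    for n in names:
        out[n] = arr[n]  # field-by-field copy drops the padding bytes
    return out

def structured_to_unstructured_sketch(arr):
    """Turn an n-D structured array into an (n+1)-D plain ndarray."""
    names = arr.dtype.names
    # Highest common type of all the fields.
    common = np.result_type(*[arr.dtype[n] for n in names])
    out = np.empty(arr.shape + (len(names),), dtype=common)
    for i, n in enumerate(names):
        out[..., i] = arr[n]
    return out

# A dtype with 8 bytes of padding between the fields and 24-byte items.
padded = np.dtype({'names': ['x', 'y'],
                   'formats': ['f8', 'f8'],
                   'offsets': [0, 16],
                   'itemsize': 24})
a = np.zeros(3, dtype=padded)
a['x'] = [1, 2, 3]
a['y'] = [4, 5, 6]

packed = repack_fields_sketch(a)            # itemsize 24 -> 16
plain = structured_to_unstructured_sketch(a)  # shape (3, 2), float64
```

Both return copies; an inverse for the second function would follow the
same field-by-field pattern in the other direction.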

We might also consider

 * apply_along_fields(arr, method) - applies the method along the
"field" axis, equivalent to something like
method(structured_to_unstructured(arr), axis=-1)
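
Again as a sketch only, with a hypothetical name and the same
scalar-fields assumption, this could be as small as stacking the fields
on a new last axis and reducing over it:

```python
import numpy as np

def apply_along_fields_sketch(arr, method):
    """Apply `method` across the fields of each record in `arr`."""
    names = arr.dtype.names
    # Cast every field to the highest common type before stacking.
    common = np.result_type(*[arr.dtype[n] for n in names])
    stacked = np.stack([arr[n].astype(common) for n in names], axis=-1)
    # The new last axis is the "field" axis.
    return method(stacked, axis=-1)

pts = np.zeros(3, dtype=[('x', 'f8'), ('y', 'f8')])
pts['x'] = [1, 2, 3]
pts['y'] = [4, 5, 6]

# Per-record mean over the x and y fields; result has shape (3,).
means = apply_along_fields_sketch(pts, np.mean)
```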

I think these are pretty minimal and shouldn't be too hard to implement.

Allan