proposed change to recarray access

Hello all,

I've submitted a pull request on GitHub which changes how string values in recarrays are returned, which may break old code.

https://github.com/numpy/numpy/pull/5454

See also: https://github.com/numpy/numpy/issues/3993

Previously, recarray fields of type 'S' or 'U' (i.e., strings) would be returned as chararrays when accessed by attribute, but ndarrays when accessed by indexing:

>>> arr = np.array([('abc ', 1), ('abc', 2)], dtype=[('str', 'S4'), ('id', int)])
>>> arr = arr.view(np.recarray)
>>> type(arr.str)
numpy.core.defchararray.chararray
>>> type(arr['str'])
numpy.core.records.recarray

Chararray is deprecated, and furthermore this led to bugs in my code, since chararrays trim trailing whitespace but ndarrays do not (and I was not aware of the conversion to chararray). For example:

>>> arr.str[0] == arr.str[1]
True
>>> arr['str'][0] == arr['str'][1]
False

In the pull request I have changed recarray attribute access so that ndarrays are always returned. I think this is a sensible thing to do, but it may break code which depends on chararray features (including the trimmed whitespace).

Does this sound reasonable?

Best,
Allan
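For code that relied on the trimmed whitespace, one possible way to keep the old comparison result is to strip the field explicitly. The following is only a minimal sketch, not part of the pull request: it reuses the example array from above, accesses the field by indexing (so it does not depend on which type attribute access returns), and the variable names `raw`, `stripped`, `raw_equal` and `stripped_equal` are made up for illustration.

```python
import numpy as np

# Same example data as in the message: one value has a trailing space.
arr = np.array([('abc ', 1), ('abc', 2)], dtype=[('str', 'S4'), ('id', int)])

# Indexing returns the stored bytes unchanged, trailing space included.
raw = arr['str']
raw_equal = bool(raw[0] == raw[1])

# Strip trailing whitespace explicitly to reproduce the trimmed comparison.
stripped = np.char.rstrip(raw)
stripped_equal = bool(stripped[0] == stripped[1])

print(raw_equal, stripped_equal)
```

Here `raw_equal` is False and `stripped_equal` is True, matching the two results shown above for indexing and for the old chararray attribute access.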
participants (1)
- Allan Haldane