[Numpy-discussion] Setting custom dtypes and 1.14

Eric Wieser wieser.eric+numpy at gmail.com
Mon Jan 29 13:22:12 EST 2018


I think that there's a lot of confusion going around about recarrays vs
structured arrays.

A [`recarray`](https://github.com/numpy/numpy/blob/v1.13.0/numpy/core/records.py)
is a wrapper around a structured array that provides:
* Attribute access to fields as `arr.field` in addition to the normal
`arr['field']`
* Automatic datatype-guessing for nested lists of tuples (which needs a
little work, but seems like a justifiable feature)
* An undocumented `field` method that behaves like the 1.14 indexing
behavior (!)
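
To make the three points above concrete, here is a minimal sketch. The field names `x` and `y` and the sample values are made up for illustration, and the exact integer dtype that gets guessed is platform dependent, so treat this as an illustration rather than a specification:

```python
import numpy as np

# Hypothetical sample data; the field names 'x' and 'y' are invented here.
arr = np.array([(1, 2.0), (3, 4.0)], dtype=[('x', 'i4'), ('y', 'f8')])

# Viewing as a recarray changes the Python type, not the underlying memory.
rec = arr.view(np.recarray)

# Attribute access works alongside the normal item access.
assert np.array_equal(rec.x, rec['x'])

# The `field` method looks a field up by name or by integer position.
y_by_name = rec.field('y')
y_by_pos = rec.field(1)

# Datatype guessing from a list of tuples; only the names are supplied.
guessed = np.rec.fromrecords([(1, 2.0), (3, 4.0)], names='x,y')
```

Nothing here is specific to recarrays beyond the `view`: `arr` itself stays an ordinary structured array throughout.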

Meanwhile, `recfunctions` is a collection of functions that work on normal
structured arrays, so the module is misleadingly named.
Its only link to recarrays is that most of the functions take an
`asrecarray` parameter, which applies `.view(recarray)` to the result.
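
As a rough illustration of that last point, using `drop_fields` as one representative function (the sample array and field names are again invented, and `usemask=False` is passed only to keep masked arrays out of the picture):

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical sample data; the field names 'x' and 'y' are invented here.
arr = np.array([(1, 2.0), (3, 4.0)], dtype=[('x', 'i4'), ('y', 'f8')])

# By default the function takes and returns a plain structured array.
plain = rfn.drop_fields(arr, 'y', usemask=False)

# With asrecarray=True the same result comes back viewed as a recarray.
as_rec = rfn.drop_fields(arr, 'y', usemask=False, asrecarray=True)

# Which is what you would get by applying the view yourself.
manual = plain.view(np.recarray)
```

Here `as_rec` and `manual` have the same type and contents, which is all the `asrecarray` flag buys you.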

> deprecate recarrays

Given how thin an abstraction they are over structured arrays, I don't
think you mean this.
Are you advocating for deprecating structured arrays entirely, or just
deprecating recfunctions?

Eric

On Mon, 29 Jan 2018 at 09:39 Chris Barker <chris.barker at noaa.gov> wrote:

> On Sat, Jan 27, 2018 at 8:50 PM, Allan Haldane <allanhaldane at gmail.com>
> wrote:
>
>> On 01/26/2018 06:01 PM, josef.pktd at gmail.com wrote:
>>
>>>     I thought recarrays were pretty cool back in the day, but pandas is
>>>     a much better option.
>>>
>>>     So I pretty much only use structured arrays for data exchange with C
>>>     code....
>>>
>>> My impression is that this turns into a deprecate recarrays and
>>> supporting recfunction issue.
>>>
>>>
>
>> *should* we have any dataframe-like functionality in numpy?
>
>
>>
>> We get requests every once in a while about how to sort rows, or about
>> adding a "groupby" function. I myself have used recarrays in a
>> dataframe-like way, when I wanted a quick multiple-array object that
>> supported numpy indexing. So there is some demand to have minimal
>> "dataframe-like" behavior in numpy itself.
>>
>> recarrays play part of this role currently, though imperfectly due to
>> padding and cache issues. I think I'm comfortable with supporting some
>> minor use of structured/recarrays as dataframe-like, with a warning in docs
>> that the user should really look at pandas/xarray, and that structured
>> arrays are primarily for data exchange.
>>
>
> Well, I think we should either:
>
> deprecate recarrays -- i.e. explicitly not support DataFrame-like
> functionality in numpy, keeping only the data-exchange functionality as
> maintained.
>
> or
>
> Properly support it -- which doesn't mean re-implementing Pandas or
> xarray, but it would mean addressing any bug-like issues like not dealing
> properly with padding.
>
> Personally, I don't need/want it enough to contribute, but if someone
> does, great.
>
> This reminds me a bit of the old numpy.Matrix issue -- it was ALMOST
> there, but not quite, with issues, and there was essentially no overlap
> between the people that wanted it and the people that had the time and
> skills to really make it work.
>
> (If we want to dream, maybe one day we should make a minimal
>> multiple-array container class. I imagine it would look pretty similar to
>> recarray, but stored as a set of arrays instead of a structured array. But
>> maybe recarrays are good enough, and let's not reimplement pandas either.)
>>
>
> Exactly -- we really don't need to re-implement Pandas....
>
> (except its CSV reading capability :-) )
>
> -CHB
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov