
I just started working on a time-series module/class in scipy/numpy, and it seemed useful to have some of the R data-frame functionality (i.e., selecting columns of data based on variable names). I tried rec-arrays but couldn't get them to work the way I wanted. I also looked at the Dataframe class by Andrew Straw, but at over 400 lines of code that seemed pretty complicated, to me at least.

I searched the mailing-list archives and found a discussion on 'Table like array' (see excerpt below). To get the minimal functionality discussed, I wrote a simple class (see attached) that tries to implement X.get('a','c'), where 'a' and 'c' are variable names linked to columns of data in X. I added some test code, so if you run the code in the attachment you will see that it seems to work. However, since this is my first class, I'd appreciate your input on the approach I used and any suggestions on how to improve the class (or use something else).

I'd also like to read the data and variable names directly from a single csv file. I tried this with the python csv module, but it read all the data as strings and I couldn't figure out how to easily separate the variable names from the data.

Thanks,
Vincent
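For readers without the attachment, here is a rough sketch of the kind of class described above. It is not the attached code: the names Table and from_csv are made up for illustration, the first csv row is assumed to hold the variable names, and every remaining field is assumed to be numeric. Plain lists are used to keep it dependency-free; each column could be wrapped in a numpy array instead.

```python
import csv
import io


class Table:
    """Hypothetical minimal table: variable names mapped to columns of floats."""

    def __init__(self, names, rows):
        self.names = list(names)
        # Transpose the rows so each variable name maps to one column.
        columns = zip(*rows) if rows else [[] for _ in self.names]
        self._columns = {name: list(col) for name, col in zip(self.names, columns)}

    @classmethod
    def from_csv(cls, fileobj):
        reader = csv.reader(fileobj)
        names = next(reader)  # assumption: the first row holds the variable names
        # The csv module yields strings, so convert every remaining field to float.
        rows = [[float(field) for field in row] for row in reader if row]
        return cls(names, rows)

    def get(self, *names):
        """Return the requested columns, in the order they were asked for."""
        return [self._columns[name] for name in names]


# Stand-in for a real file opened with open(path, newline="").
sample = io.StringIO("a,b,c\n1,2,3\n4,5,6\n")
X = Table.from_csv(sample)
print(X.get("a", "c"))
```

Calling next(reader) once is what separates the variable names from the data; everything after that row is treated as numbers.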
[Numpy-discussion] Re: [SciPy-user] Table like array
Paul Barrett pebarrett at gmail.com
Wed Mar 1 06:45:02 CST 2006
On 3/1/06, Travis Oliphant <oliphant.travis at ieee.org> wrote:
> How many people would like to see x['f1','f2','f5'] return a new array with a new data-type descriptor constructed from the provided fields?
I'm surprised that it's not already available.
-- Paul
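
As a point of comparison with the question quoted above: NumPy structured arrays accept a list of field names as an index, which gives something close to the behaviour being asked about. The sketch below uses that list form, x[['f1','f5']], not the x['f1','f2','f5'] spelling from the thread; the field names and values are invented for illustration.

```python
import numpy as np

# A small structured array with three named fields (made-up data).
x = np.array(
    [(1.0, 10, 0.5), (2.0, 20, 1.5)],
    dtype=[("f1", "f8"), ("f2", "i4"), ("f5", "f8")],
)

# Indexing with a list of names selects just those fields.
sub = x[["f1", "f5"]]
print(sub.dtype.names)
print(sub["f5"])
```

In recent NumPy releases the multi-field result is a view onto the original array, so call .copy() on it if an independent array is needed.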