[MATRIX-SIG] Proposal: "Tables" in NumPy

Jim Phillips jim@ks.uiuc.edu
Fri, 25 Jul 97 11:08:33 -0500


Hoon Yoon wrote:
> I am attempting to incorporate hash('STR') to an Array. Since hash turns
> a String into uniq numeric sequence, I am trying to use it to sneak in

The hash function does not return a unique number for every string.  This
would be impossible!  A hash function returns an integer given some data, but
collisions do occur and have to be dealt with.  There is definitely not a 1-1
mapping between strings and hash values (there are fewer 10-digit numbers than
10-character strings).
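
To make the counting argument concrete, here is a small sketch.  The names
toy_hash and find_collision are made up for illustration, and toy_hash is
not Python's built-in hash(); it is just a deliberately small hash so that a
collision turns up quickly.

```python
import string
from itertools import product

# Counting argument: even restricted to lowercase letters, there are far
# more 10-character strings than 10-digit numbers.
ten_digit_numbers = 10 ** 10
ten_letter_strings = 26 ** 10
assert ten_letter_strings > ten_digit_numbers


def toy_hash(s, buckets=100):
    """Hypothetical toy hash: fold character codes into a small range."""
    h = 0
    for ch in s:
        h = (h * 31 + ord(ch)) % buckets
    return h


def find_collision(buckets=100):
    """Return two different 3-letter strings with the same toy_hash value."""
    seen = {}  # hash value -> first string that produced it
    for letters in product(string.ascii_lowercase, repeat=3):
        s = "".join(letters)
        h = toy_hash(s, buckets)
        if h in seen:
            return seen[h], s
        seen[h] = s
    return None  # not reached when there are more strings than buckets


first, second = find_collision()
```

With 26**3 candidate strings and only 100 possible hash values, two different
strings must share a value, which is the same argument on a smaller scale.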

As I see it, there are two possibilities.  You can store the strings
separately in a list, dictionary, array, etc. and define your own IDs based
on that, or you can write your own function to map (short!) strings uniquely
to integers.
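
Both options can be sketched in a few lines.  This is only an illustration
under my own assumptions: the class StringIds and the functions encode and
decode are hypothetical names, and the second option is restricted here to
lowercase ASCII letters.

```python
class StringIds:
    """Option 1: keep the strings separately and hand out our own ids."""

    def __init__(self):
        self._ids = {}     # string -> id
        self._names = []   # id -> string

    def id_for(self, s):
        # Ids are assigned in order of first appearance: 0, 1, 2, ...
        if s not in self._ids:
            self._ids[s] = len(self._names)
            self._names.append(s)
        return self._ids[s]

    def name_for(self, i):
        return self._names[i]


ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def encode(s):
    """Option 2: map a short lowercase string to a unique integer.

    Uses "digits" 1..26 so that 'a', 'aa', 'aaa' all get different values.
    Raises ValueError for characters outside ALPHABET.
    """
    n = 0
    for ch in s:
        n = n * 26 + ALPHABET.index(ch) + 1
    return n


def decode(n):
    """Inverse of encode."""
    chars = []
    while n > 0:
        n, r = divmod(n - 1, 26)
        chars.append(ALPHABET[r])
    return "".join(reversed(chars))
```

The "(short!)" matters for the second option: with this encoding, strings of
up to six lowercase letters fit in a signed 32-bit integer, and longer ones
do not.  The first option has no such limit, but the ids only mean something
together with the stored list of strings.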

I do sympathize with this problem.  I would really love to see something
higher level than an array, perhaps called a "table", which could be indexed
as a dictionary (columns by heading) in some dimensions, and as an array (rows
by index) in others.  This would really simplify data storage since one could
label data and access it without worrying about which column "pressure" is in.
Pickled tables would also be self-documenting!

I have written something like this in Python, but only in two dimensions and
it is rather slow.  I would love to see a full N-dimensional implementation as
a part of NumPy.  Perhaps a good interface would look something like this:

>>> a  # a 2-D table
dogs   cats    birds
 2      4       6
 1      2       3
 9      8       7
>>> a['dogs']  # returns array (reference!)
2  1  9
>>> a[:,1]  # returns table (reference!)
dogs   cats    birds
 1      2       3
>>> a[('dogs','birds'),0:2]  # returns table (reference!)
dogs   birds
 2      6
 1      3
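
For what it is worth, the session above can be approximated in two
dimensions with a short sketch like the following.  This is not the
implementation mentioned earlier and not an existing NumPy feature; the class
name Table and its layout (one 1-D array per heading) are assumptions chosen
so that every result shares memory with the original, matching the
"(reference!)" comments.

```python
import numpy as np


class Table:
    """Hypothetical 2-D table: columns by heading, rows by index."""

    def __init__(self, columns):
        # columns: mapping heading -> 1-D data; dict order is column order.
        # np.asarray does not copy when it is handed an existing array.
        self.columns = {name: np.asarray(col) for name, col in columns.items()}

    def __getitem__(self, key):
        if isinstance(key, str):
            return self.columns[key]          # the column array itself
        cols, rows = key                      # a[columns, rows], as in the proposal
        if isinstance(cols, slice):
            names = list(self.columns)[cols]
        elif isinstance(cols, str):
            names = [cols]
        else:
            names = list(cols)
        if isinstance(rows, int):
            # Turn a single row index into a one-row slice so the result
            # is still a table; "or None" keeps rows == -1 working.
            rows = slice(rows, rows + 1 or None)
        # Basic slicing of a NumPy array returns a view, not a copy.
        return Table({name: self.columns[name][rows] for name in names})

    def __repr__(self):
        names = list(self.columns)
        rows = zip(*(self.columns[n].tolist() for n in names))
        lines = ["\t".join(names)]
        lines += ["\t".join(str(v) for v in row) for row in rows]
        return "\n".join(lines)


a = Table({"dogs": [2, 1, 9], "cats": [4, 2, 8], "birds": [6, 3, 7]})
sub = a[("dogs", "birds"), 0:2]
sub["dogs"][0] = 20   # writes through to a, since sub holds views
```

Storing one array per column is a design choice made for the sketch: picking
non-adjacent columns out of a single 2-D array with an index list would give
a copy in NumPy, whereas a dictionary of column arrays can hand back the same
arrays (or slices of them) and so keep the reference behaviour.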

This could really be a killer feature for data analysis since we would no  
longer have to choose between C-style structures and FORTRAN-style arrays!

-Jim Phillips

_______________
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
_______________