FW: [Numpy-discussion] Questions about records

Fri Apr 2 12:09:04 EST 2004

At 2:12 PM -0500 2004-04-02, Jin-chung Hsu wrote:
>First of all, the records module was developed mainly having the 1-D table in
>mind.  Even though it can have higher than one dimension, it is not thoroughly
>tested, as you have found out.  However, I'd argue that in many cases the
>need for a 2-D (or higher) table can be met by having an array in each
>cell (element).  In your example, instead of creating a 2x2 table with each
>cell holding just one number, you may be able to use a table with just one
>row in which each cell is a 2x2 array.  You can create such a record like this:
>
>--> arr1 = num.arange(4, shape=(1,2,2), type=num.Float64)
>--> arr2 = num.arange(4, shape=(1,2,2), type=num.Float64)+10
>--> a = rec.array([arr1, arr2], names="a,b")

But is there any advantage to that over just using separate named 
arrays of the desired shape? For example:
a = num.arange(4, shape=(2,2), type=num.Float64)
b = num.arange(4, shape=(2,2), type=num.Float64)+10

>I'd be interested in your application as to why a 2x2 table is necessary.

Here are two different uses I've come up with (both related to image 
processing). Both are beautifully served by a 2-d records array:

1) Find the centroid of a star. The algorithm I'm using (invented by 
Jim Gunn, I believe) is to walk across the image, looking for the 
point of maximum symmetry. At each point, total pixels and a measure 
of asymmetry are computed on a 3x3 grid centered at that point. The 
minimum asymmetry in that 3x3 array is then used to determine where 
to walk next. (At the end a parabolic fit is done to the 3x3 
asymmetry data to find the true centroid; up until then it is only 
known to the nearest pixel.)
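
The final refinement step might be sketched as follows. This is only an 
illustration: it assumes two independent three-point parabola fits through 
the center row and the center column of the 3x3 asymmetry grid, which is 
one simple way to do a parabolic fit and not necessarily the form used in 
the actual algorithm. The function names are hypothetical.

```python
def parabola_vertex_offset(f_minus, f_zero, f_plus):
    """Vertex position of the parabola through the three samples
    (-1, f_minus), (0, f_zero), (+1, f_plus), relative to the middle one."""
    curvature = f_minus - 2.0 * f_zero + f_plus
    if curvature <= 0:
        # The samples do not bracket a minimum, so there is nothing to refine.
        raise ValueError("samples are not convex; no minimum to fit")
    return 0.5 * (f_minus - f_plus) / curvature


def refine_minimum(asymm):
    """asymm is a 3x3 nested sequence indexed [row][col] whose smallest
    value sits in the center cell. Returns the (row, col) sub-pixel
    offsets of the fitted minimum from the center cell.

    Assumption: the fit is done separately along each axis."""
    drow = parabola_vertex_offset(asymm[0][1], asymm[1][1], asymm[2][1])
    dcol = parabola_vertex_offset(asymm[1][0], asymm[1][1], asymm[1][2])
    return drow, dcol
```

Adding the returned offsets to the integer position of the center cell 
gives the sub-pixel estimate of the centroid.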

In any case...right now I maintain two separate 3x3 arrays (total 
pixels and asymmetry). Whenever I take a step I shift both arrays 
and then compute new data for the points that are missing data.

It would be cleaner and nicer to maintain one 3x3 records array with 
fields "totPix" and "asymm". Then the related data sticks together 
and I only have to shift the data once. (I meant to code it that way 
from the start, but my early attempts to use numarray.records were a 
disaster. I have a somewhat better handle on it now and may update my 
code.)
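
A minimal sketch of the single-grid idea, assuming NumPy structured arrays 
as a present-day stand-in for numarray.records (an assumption, not the 
module discussed in this thread). The field names "totPix" and "asymm" 
come from the text above; shift_grid and its sign convention are 
hypothetical.

```python
import numpy as np

# One record per grid cell, so the two quantities travel together.
grid_dtype = np.dtype([("totPix", np.float64), ("asymm", np.float64)])


def shift_grid(grid, drow, dcol):
    """Re-center the grid after a step of (drow, dcol) pixels.

    Returns the shifted grid plus a boolean mask of the cells that have
    no data yet and must be recomputed. Convention (an assumption): the
    new cell [i, j] takes its data from the old cell [i + drow, j + dcol].
    """
    n_rows, n_cols = grid.shape
    shifted = np.zeros_like(grid)
    missing = np.ones(grid.shape, dtype=bool)
    for i in range(n_rows):
        for j in range(n_cols):
            src_i, src_j = i + drow, j + dcol
            if 0 <= src_i < n_rows and 0 <= src_j < n_cols:
                # A single assignment moves both fields of the record.
                shifted[i, j] = grid[src_i, src_j]
                missing[i, j] = False
    return shifted, missing


grid = np.zeros((3, 3), dtype=grid_dtype)
grid["totPix"] = np.arange(9.0).reshape(3, 3)
grid["asymm"] = grid["totPix"] + 10
shifted, missing = shift_grid(grid, 0, 1)  # step one pixel along the columns
```

After the step, only the cells flagged in missing need fresh totPix and 
asymm values; everything else has been carried over in one pass.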

2) Find all stars on an image. The algorithm I'm using (invented by 
Jeff Morgan, I believe) is to break an image up into blocks of, say, 
5x5 pixels. I then compute information about each "super pixel", such 
as center of mass, total counts, etc. My C++ code has 12 items of 
information for each super pixel (including 7 boolean flags) and is 
written to use a 2-dimensional array each element of which is a data 
structure with the appropriate fields. The obvious Python equivalent 
is a numarray.records array. It sure sounds better than trying to 
keep track of 12 separate arrays!
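
A rough sketch of what that could look like, again assuming NumPy 
structured arrays in place of numarray.records. Only a few of the 12 
items are shown, and every field name here (including the single flag) 
is hypothetical, as is the summarize_blocks helper.

```python
import numpy as np

# Hypothetical subset of the per-super-pixel items.
superpix_dtype = np.dtype([
    ("total", np.float64),   # total counts in the block
    ("xcen", np.float64),    # center of mass, column coordinate in the image
    ("ycen", np.float64),    # center of mass, row coordinate in the image
    ("is_empty", np.bool_),  # hypothetical flag: block contains no counts
])


def summarize_blocks(image, block=5):
    """Summarize each block x block tile of a 2-d image in one record.

    Rows and columns left over at the image edge are ignored."""
    n_rows = image.shape[0] // block
    n_cols = image.shape[1] // block
    out = np.zeros((n_rows, n_cols), dtype=superpix_dtype)
    yy, xx = np.mgrid[0:block, 0:block]  # pixel offsets within a tile
    for i in range(n_rows):
        for j in range(n_cols):
            tile = image[i * block:(i + 1) * block,
                         j * block:(j + 1) * block]
            total = tile.sum()
            out["total"][i, j] = total
            if total > 0:
                out["xcen"][i, j] = j * block + (tile * xx).sum() / total
                out["ycen"][i, j] = i * block + (tile * yy).sum() / total
            else:
                # No counts: the center of mass is undefined.
                out["xcen"][i, j] = np.nan
                out["ycen"][i, j] = np.nan
                out["is_empty"][i, j] = True
    return out
```

With sp = summarize_blocks(image), sp[i, j] holds every item for one 
super pixel, while sp["total"] is an ordinary 2-d array of one item 
across all super pixels.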

-- Russell