FW: [Numpy-discussion] Questions about records
Russell E Owen
owen at astro.washington.edu
Fri Apr 2 12:09:04 EST 2004
At 2:12 PM -0500 2004-04-02, Jin-chung Hsu wrote:
>First of all, the records module was developed mainly having the 1-D table in
>mind. Even though it can have higher than one dimension, it is not thoroughly
>tested, as you have found out. However, I'd argue that in many cases the
>need to use a 2-D (or higher) table can be substituted by having an array in
>each cell (element). In your example, instead of creating a 2x2 table with
>each cell just having one number, you may be able to use a table with just
>one row where each cell is a 2x2 array. You can create such a record like this:
>
>--> arr1 = num.arange(4, shape=(1,2,2), type=num.Float64)
>--> arr2 = num.arange(4, shape=(1,2,2), type=num.Float64)+10
>--> a = rec.array([arr1, arr2], names="a,b")
But is there any advantage to that compared to just using named
arrays of the desired shape:
    a = num.arange(4, shape=(2,2), type=num.Float64)
    b = num.arange(4, shape=(2,2), type=num.Float64)+10
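To make the two layouts concrete, here is a minimal sketch of both using NumPy structured arrays. Note the assumption: this uses present-day NumPy, not numarray.records, so the calls differ from the ones quoted above. The field names "a" and "b" follow the quoted example; the variable names one_row and table are made up for illustration.

```python
import numpy as np

# Layout 1 (assumed NumPy analogue of the quoted suggestion):
# a table with ONE row, where each cell holds a whole 2x2 array.
one_row = np.zeros(1, dtype=[("a", np.float64, (2, 2)),
                             ("b", np.float64, (2, 2))])
one_row["a"][0] = np.arange(4.0).reshape(2, 2)
one_row["b"][0] = np.arange(4.0).reshape(2, 2) + 10

# Layout 2: a 2x2 table, where each cell holds one number per field.
table = np.zeros((2, 2), dtype=[("a", np.float64), ("b", np.float64)])
table["a"] = np.arange(4.0).reshape(2, 2)
table["b"] = table["a"] + 10

# In layout 2 a single index returns one record with both fields,
# so the related values for one grid point travel together.
cell = table[1, 0]
```

In the second layout, table[1, 0] is a single record carrying both "a" and "b" for that grid point, which is the property the two use cases below rely on.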
>I'd be interested in your application as to why a 2x2 table is necessary.
Here are two different uses I've come up with (both related to image
processing). Both are beautifully served by a 2-d records array:
1) Find the centroid of a star. The algorithm I'm using (invented by
Jim Gunn, I believe) is to walk across the image, looking for the
point of maximum symmetry. At each point, total pixels and a measure
of asymmetry are computed in a 3x3 grid centered at that point. The
minimum asymmetry in that 3x3 array is then used to determine where
to walk next. (At the end a parabolic fit is done to the 3x3
asymmetry data to find the true centroid; up until then it's only
known to the nearest pixel.)
In any case...right now I maintain two separate 3x3 arrays (total
pixels and asymmetry). Whenever I take a step I shift both arrays
and then compute new data for the points which are missing data.
It would be cleaner and nicer to maintain one 3x3 records array with
fields "totPix" and "asymm". Then the related data sticks together
and I only have to shift the data once. (I meant to code it that way
from the start, but my early attempts to use numarray.records were a
disaster. I have a somewhat better handle on it now and may update my
code).
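As a rough sketch of that idea, the following keeps both quantities in one 3x3 structured array and shifts them together in a single operation. It assumes present-day NumPy rather than numarray.records, and the helper name shift_grid and the use of NaN to mark cells that still need computing are hypothetical choices for illustration, not part of the actual centroiding code.

```python
import numpy as np

# One record per grid point; field names follow the text above.
grid_dtype = np.dtype([("totPix", np.float64), ("asymm", np.float64)])


def shift_grid(grid, dy, dx):
    """Return a copy of grid moved by (dy, dx) cells.

    Cells that have no source data after the move are set to NaN in
    every field, marking them as needing fresh computation.
    (Hypothetical helper, for illustration only.)
    """
    out = np.empty_like(grid)
    for name in grid.dtype.names:
        out[name] = np.nan
    ny, nx = grid.shape
    # Source and destination windows that overlap after the move.
    src_y = slice(max(0, -dy), min(ny, ny - dy))
    dst_y = slice(max(0, dy), min(ny, ny + dy))
    src_x = slice(max(0, -dx), min(nx, nx - dx))
    dst_x = slice(max(0, dx), min(nx, nx + dx))
    # A single assignment moves all fields of each record at once.
    out[dst_y, dst_x] = grid[src_y, src_x]
    return out


grid = np.zeros((3, 3), dtype=grid_dtype)
grid["totPix"] = np.arange(9.0).reshape(3, 3)
grid["asymm"] = grid["totPix"] + 100

# Move the data one column to the left; the right column becomes NaN.
shifted = shift_grid(grid, 0, -1)
needs_update = np.isnan(shifted["asymm"])
```

Because the shift acts on whole records, totPix and asymm cannot get out of step with each other, and needs_update shows which grid points must be recomputed.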
2) Find all stars on an image. The algorithm I'm using (invented by
Jeff Morgan, I believe) is to break an image up into blocks of, say,
5x5 pixels. I then compute information about each "super pixel", such
as center of mass, total counts, etc. My C++ code has 12 items of
information for each super pixel (including 7 boolean flags) and is
written to use a 2-dimensional array each element of which is a data
structure with the appropriate fields. The obvious Python equivalent
is a numarray.records array. It sure sounds better than trying to
keep track of 12 separate arrays!
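For illustration, a minimal sketch of such a super-pixel array follows. It assumes present-day NumPy structured arrays rather than numarray.records. The four fields shown (counts, xCen, yCen, isEmpty) are a hypothetical subset standing in for the 12 items in the C++ code, and make_superpixels is an invented name.

```python
import numpy as np

# Hypothetical subset of the per-super-pixel fields.
superpix_dtype = np.dtype([
    ("counts", np.float64),   # total counts in the block
    ("xCen", np.float64),     # center of mass, image x coordinate
    ("yCen", np.float64),     # center of mass, image y coordinate
    ("isEmpty", np.bool_),    # flag: block has no positive signal
])


def make_superpixels(image, block=5):
    """Summarize image in block x block super pixels (illustrative)."""
    ny, nx = image.shape
    by, bx = ny // block, nx // block
    # Trim any partial blocks, then view as (by, block, bx, block).
    trimmed = image[:by * block, :bx * block]
    blocks = trimmed.reshape(by, block, bx, block)
    yy, xx = np.mgrid[:by * block, :bx * block]
    yb = yy.reshape(by, block, bx, block)
    xb = xx.reshape(by, block, bx, block)

    sp = np.zeros((by, bx), dtype=superpix_dtype)
    counts = blocks.sum(axis=(1, 3))
    empty = counts <= 0
    safe = np.where(empty, 1.0, counts)  # avoid dividing by zero
    sp["counts"] = counts
    sp["isEmpty"] = empty
    sp["xCen"] = np.where(empty, np.nan, (blocks * xb).sum(axis=(1, 3)) / safe)
    sp["yCen"] = np.where(empty, np.nan, (blocks * yb).sum(axis=(1, 3)) / safe)
    return sp


image = np.zeros((10, 10))
image[2, 3] = 4.0
image[7, 8] = 2.0
sp = make_superpixels(image)
```

Each element sp[i, j] is then one record holding every field for that super pixel, much like one element of the 2-dimensional array of structures in the C++ version.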
-- Russell