[Numpy-discussion] record arrays initialization
Moroney, Catherine M (388D)
Catherine.M.Moroney at jpl.nasa.gov
Wed May 2 17:45:44 EDT 2012
Thanks to Perry for some very useful off-list conversation. I realize that
I wasn't at all clear in my earlier description of the problem, so here it is
in a nutshell:
Find the best match in an array t of shape (5000, 7) for a single vector e of shape (7).
Now scale it up so that e is (128, 512, 7), and I want to return a (128, 512) array of the
t-identifiers that are the best match for e. "Best match" is defined as the minimum Euclidean distance.
I'm going to try three ways: (a) brute force and lots of looping in Python,
(b) constructing a function to find the match for a single instance of e and
vectorizing it, and (c) coding it in Fortran. I'll be curious to see the
performance figures.
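For comparison with the three approaches above, here is a minimal sketch of one more option: broadcasting the distance calculation in NumPy, one chunk of e at a time. It is not one of the methods proposed in this message. The function name best_match, the chunk parameter, and the small random test arrays are all made up for illustration. Chunking is there only because a single broadcast over the full (128, 512, 7) and (5000, 7) arrays would need a very large temporary.

```python
import numpy

def best_match(e, t, chunk=256):
    """Return, for every vector in e, the index of the nearest row of t."""
    flat = e.reshape(-1, e.shape[-1])            # (N, 7)
    out = numpy.empty(flat.shape[0], dtype=numpy.intp)
    for start in range(0, flat.shape[0], chunk):
        block = flat[start:start + chunk]        # (c, 7)
        # Broadcast to (c, len(t), 7); the chunk size bounds this temporary.
        d2 = ((block[:, None, :] - t[None, :, :]) ** 2).sum(axis=2)
        # The square root is monotonic, so argmin of the squared distance is enough.
        out[start:start + chunk] = d2.argmin(axis=1)
    return out.reshape(e.shape[:-1])

# Small made-up sizes so the sketch runs quickly.
rng = numpy.random.RandomState(0)
t = rng.rand(50, 7)
e = rng.rand(4, 6, 7)
idx = best_match(e, t, chunk=5)
print(idx.shape)   # (4, 6)
```

With the real sizes, the chunk value trades memory for the number of Python-level loop iterations; a suitable value would have to be found by timing.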
Two smaller questions:
A) How do I most efficiently construct a record array from a single array?
I want to do the following, but it segfaults on me when I try to print b.
vtype = [("x", numpy.ndarray)]
a = numpy.arange(0, 16).reshape(4,4)
b = numpy.recarray((4), dtype=vtype, buf=a)
print a
print b
What is the most efficient way of constructing b from the values of a? In real life,
a is (128*512*7) and I want b to be (128, 512), with the x component being a 7-value numpy array.
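One possible approach, sketched under the assumption that a is C-contiguous, is to declare x as a fixed-length sub-array field and view the existing memory through that dtype. A likely cause of the crash above is that numpy.ndarray, used as a field type, is treated as a generic Python-object field, so buf=a would reinterpret the integers in a as object pointers. The 4x4 array below stands in for the real data.

```python
import numpy

a = numpy.arange(0, 16).reshape(4, 4)

# One field "x" holding a fixed-length sub-array: one row of a per record.
vtype = [("x", a.dtype, (a.shape[-1],))]

# Reinterpret the same memory without copying. The view collapses the last
# axis to length 1, so reshape drops it; a must be C-contiguous for this.
b = a.view(vtype).reshape(a.shape[:-1]).view(numpy.recarray)

print(b.shape)   # (4,)
print(b.x[1])    # [4 5 6 7]
```

For an array of shape (128, 512, 7), the same three steps should give a (128, 512) record array whose x field holds 7 values per record.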
and
B) If I'm vectorizing a function ("single") to find the best match for
a single element of e within t, how do I pass the entire array t into
the function without having it parcelled down to its individual elements?
i.e.
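def single(element, targets):
    # length of the single vector and number of candidate rows
    nlen = element.shape[0]
    nvec = targets.data.shape[0]
    # repeat the vector so it lines up row-for-row with the targets
    x = element.reshape(1, nlen).repeat(nvec, axis=0)
    diffs = ((x - targets.data)**2).sum(axis=1)
    diffs = numpy.sqrt(diffs)
    # index of the closest target row
    return numpy.argmin(diffs, axis=0)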
multiple = numpy.vectorize(single)
x = multiple(all_elements, target)
where all_elements is similar to "b" in my first example, and target
is a 2-d array. The above code doesn't work because numpy.vectorize hands
"single" one element of "target" at a time, and "single" needs to see the
whole array.
I found a work-around by encapsulating target into a single object and passing
in the object, but I'm curious if there's a better way of doing this.
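One alternative to the wrapper object, sketched here on the assumption that a more recent NumPy release is available in which numpy.vectorize accepts a signature argument: declare the core dimensions so that "single" receives one whole vector and the whole target array on each call. The plain (4, 6, 7) array and the random data are stand-ins for illustration, and "single" is simplified to take ordinary arrays rather than the wrapper.

```python
import numpy

def single(element, targets):
    # element has shape (n,), targets has shape (m, n)
    diffs = numpy.sqrt(((targets - element) ** 2).sum(axis=1))
    return numpy.argmin(diffs)

# "(n),(m,n)->()": loop over the leading axes of the first argument only and
# pass the second argument through whole; each call returns a scalar.
multiple = numpy.vectorize(single, signature="(n),(m,n)->()")

rng = numpy.random.RandomState(0)
target = rng.rand(50, 7)
all_elements = rng.rand(4, 6, 7)
matches = multiple(all_elements, target)
print(matches.shape)   # (4, 6)
```

numpy.vectorize is still a Python-level loop underneath, so this addresses the argument-passing problem rather than speed.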
I hope I've explained myself better this time around,
Catherine