[Numpy-discussion] Shape of join_by result is not what I expected
David Carmean
dlc at halibut.com
Wed Feb 10 19:26:04 EST 2010
On Tue, Feb 09, 2010 at 04:02:48PM -0600, Robert Kern wrote:
> numpy.lib.recfunctions.join_by(key, r1, r2, jointype='leftouter')
> * The output is sorted along the key.
> * A temporary array is formed by dropping the fields not in the key for the
> two arrays and concatenating the result. This array is then sorted, and
> the common entries selected. The output is constructed by filling the fields
> with the selected entries. Matching is not preserved if there are some
> duplicates...
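
A minimal sketch of the call quoted above, using two small made-up record arrays. The field names `a` and `b` and all values are illustrative assumptions, not the real data. Passing usemask=False asks for a plain ndarray instead of the default masked array:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Two hypothetical record arrays sharing the key field "tposix".
r1 = np.array([(1265184061, 0.02), (1265184121, 0.0)],
              dtype=[("tposix", "i8"), ("a", "f8")])
r2 = np.array([(1265184061, 3.0), (1265184121, 0.0)],
              dtype=[("tposix", "i8"), ("b", "f8")])

# Left outer join on the key; usemask=False returns a plain structured array.
joined = rfn.join_by("tposix", r1, r2, jointype="leftouter", usemask=False)

print(joined.shape)        # one record per key value, so the array is 1-D
print(joined.dtype.names)  # the columns live in the dtype, not in a second axis
```

The result holds one record per key, which is why its shape is (n,) rather than (n, number_of_fields).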
Got this to "work", but now it's revealed my lack of understanding of the shape
of arrays; I'd hoped that the results would look like (be the same shape as?)
the column_stack results. I wanted to be able to take slices of the
results.
I created the original arrays from a list of tuples of the form
[(1265184061, 0.02), (1265184121, 0.0), (1265184181, 0.31), ]
so the resulting arrays had the shape (n,2); these seemed easy to
manipulate by slicing, and my recollection was that this was a
useful format to feed to matplotlib's plot().
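
For reference, a small sketch of how such a list of tuples becomes an (n,2) array and how it slices. The variable names are illustrative only:

```python
import numpy as np

# The sample (timestamp, value) pairs shown above.
samples = [(1265184061, 0.02), (1265184121, 0.0), (1265184181, 0.31)]

# A list of equal-length tuples converts to an ordinary 2-D float array.
arr = np.array(samples)
print(arr.shape)     # (3, 2)

times = arr[:, 0]    # first column: timestamps
values = arr[:, 1]   # second column: measurements
first_two = arr[:2]  # first two rows, still 2-D
```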
The result looks like:
array([ (1265184061.0, 0.0, 0.029999999999999999, 152.0, 1.5600000000000001, \
99.879999999999995, 0.02, 3.0, 0.0, 0.040000000000000001, 0.070000000000000007, \
0.68999999999999995),\
(1265184121.0, 0.0, 0.01, 148.0, 1.46, 99.950000000000003, 0.0, 0.0, 0.0, 0.01, \
0.040000000000000001, 0.56000000000000005), ] )
with shape (n,)
These 1-dimensional results give me nice text output, but I don't know
how to slice them. This form may work for one of my use cases, but my
main use case is to reprocess this data (which is for one server) by
taking one field from about 60 servers' worth of this data (saved to disk
as binary pickles) and plotting them all on a single canvas.
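
A sketch of how a 1-dimensional structured array like this can be sliced, assuming the field names from the table below and a plain (non-masked) result. Rows are selected by position and columns by field name; a 2-D float array can be rebuilt with column_stack when every field is numeric:

```python
import numpy as np

# Assumed dtype: the four fields named in the table, all stored as floats.
dtype = [("tposix", "f8"), ("ldavg-15", "f8"), ("ldavg-1", "f8"), ("ldavg-5", "f8")]
joined = np.array([
    (1265184061.0, 0.0, 0.03, 1.56),
    (1265184121.0, 0.0, 0.01, 1.46),
    (1265184181.0, 0.0, 0.65, 1.37),
], dtype=dtype)

print(joined.shape)          # (3,): one record per row

ldavg1 = joined["ldavg-1"]   # a whole column, selected by field name
first_two = joined[:2]       # the first two records, selected by position

# Rebuild an ordinary 2-D array, one column per field.
as_2d = np.column_stack([joined[name] for name in joined.dtype.names])
print(as_2d.shape)           # (3, 4)
```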
In other words, from sixty sets of this:
tposix         ldavg-15  ldavg-1  ldavg-5
1265184061.00  0.00      0.03     1.56
1265184121.00  0.00      0.01     1.46
1265184181.00  0.00      0.65     1.37
I need to collect and plot ldavg-1 as separate time-series plots.
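
One way this collection step might look, as a sketch only. It assumes each server's data is already loaded into a structured array with a `tposix` field; the server names, the `server-b` values, and the helper names `collect_field` and `plot_series` are all hypothetical:

```python
import numpy as np

def collect_field(per_server, field, time_field="tposix"):
    """Return {server: (times, values)} for one field of each structured array."""
    return {name: (rec[time_field], rec[field]) for name, rec in per_server.items()}

def plot_series(series, label):
    """Draw every server's time series on one set of axes."""
    import matplotlib.pyplot as plt  # imported here so collecting works without it
    fig, ax = plt.subplots()
    for name, (t, y) in sorted(series.items()):
        ax.plot(t, y, label=name)
    ax.set_xlabel("tposix")
    ax.set_ylabel(label)
    ax.legend()
    return fig

dtype = [("tposix", "f8"), ("ldavg-15", "f8"), ("ldavg-1", "f8"), ("ldavg-5", "f8")]
per_server = {
    # Values taken from the table above.
    "server-a": np.array([(1265184061.0, 0.0, 0.03, 1.56),
                          (1265184121.0, 0.0, 0.01, 1.46),
                          (1265184181.0, 0.0, 0.65, 1.37)], dtype=dtype),
    # Made-up values for a second, hypothetical server.
    "server-b": np.array([(1265184061.0, 0.1, 0.20, 0.30),
                          (1265184121.0, 0.1, 0.25, 0.35)], dtype=dtype),
}

series = collect_field(per_server, "ldavg-1")
# fig = plot_series(series, "ldavg-1")  # uncomment to draw all servers together
```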
(Perhaps I'm trying to use this stuff for a real project too early on the
learning curve? :)
Thanks for the great help so far.