[Numpy-discussion] Shape of join_by result is not what I expected

David Carmean dlc at halibut.com
Wed Feb 10 19:26:04 EST 2010


On Tue, Feb 09, 2010 at 04:02:48PM -0600, Robert Kern wrote:

> numpy.lib.recfunctions.join_by(key, r1, r2, jointype='leftouter')

>     * The output is sorted along the key.
>     * A temporary array is formed by dropping the fields not in the key for the
>       two arrays and concatenating the result. This array is then sorted, and
>       the common entries selected. The output is constructed by filling the
>       fields with the selected entries. Matching is not preserved if there are
>       some duplicates...
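
For readers following along, here is a minimal sketch of the call quoted
above. The two input arrays and the field names (tposix, iowait, ldavg1) are
made up for illustration, and usemask=False is my choice so that a plain
ndarray comes back instead of a masked array.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Two small structured arrays sharing the key field "tposix".
# The field names and values are illustrative assumptions only.
r1 = np.array(
    [(1265184061, 0.02), (1265184121, 0.0), (1265184181, 0.31)],
    dtype=[("tposix", "i8"), ("iowait", "f8")],
)
r2 = np.array(
    [(1265184061, 0.03), (1265184121, 0.01), (1265184181, 0.65)],
    dtype=[("tposix", "i8"), ("ldavg1", "f8")],
)

# Join on the key; usemask=False returns a plain structured ndarray.
joined = rfn.join_by("tposix", r1, r2, jointype="leftouter", usemask=False)

print(joined.dtype.names)  # key field plus the non-key fields of r1 and r2
print(joined.shape)        # one-dimensional: one record per key value
```

The result is a structured array with one record per key, which is why its
shape is (n,) rather than (n, number_of_fields).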

Got this to "work", but it has now revealed my lack of understanding of array
shapes; I'd hoped that the results would look like (be the same shape as?)
the column_stack results.  I wanted to be able to take slices of the
results.

I created the original arrays from a list of tuples of the form

    [(1265184061, 0.02), (1265184121, 0.0), (1265184181, 0.31), ]

so the resulting arrays had the shape (n,2); these seemed easy to
manipulate by slicing, and my recollection was that this was a
useful format to feed to matplotlib's plot().
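
A minimal sketch of that starting point, assuming the list of tuples is passed
straight to np.array with no dtype given (the variable names are mine):

```python
import numpy as np

# A list of (timestamp, value) tuples, as in the example above.
samples = [(1265184061, 0.02), (1265184121, 0.0), (1265184181, 0.31)]

# Without an explicit dtype, NumPy builds an ordinary 2-D float array.
arr = np.array(samples)
print(arr.shape)  # (3, 2)

# Columns are selected with ordinary 2-D slicing.
times = arr[:, 0]
values = arr[:, 1]
print(values)
```

Because every element shares one dtype here, the integer timestamps are
converted to floats.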

The result looks like:

  array([ (1265184061.0, 0.0, 0.029999999999999999, 152.0, 1.5600000000000001,
           99.879999999999995, 0.02, 3.0, 0.0, 0.040000000000000001,
           0.070000000000000007, 0.68999999999999995),
          (1265184121.0, 0.0, 0.01, 148.0, 1.46, 99.950000000000003, 0.0, 0.0,
           0.0, 0.01, 0.040000000000000001, 0.56000000000000005), ] )

with shape (n,)
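
For reference, an array of shape (n,) like this is a structured array: each
element is one record, and columns are reached by field name rather than by a
second index. A minimal sketch, assuming four fields with the hypothetical
names shown in the dtype (the real result has twelve fields):

```python
import numpy as np

# Hypothetical field names; only four of the twelve columns are modelled.
dtype = [("tposix", "f8"), ("ldavg-15", "f8"), ("ldavg-1", "f8"), ("ldavg-5", "f8")]
rec = np.array(
    [
        (1265184061.0, 0.0, 0.03, 1.56),
        (1265184121.0, 0.0, 0.01, 1.46),
        (1265184181.0, 0.0, 0.65, 1.37),
    ],
    dtype=dtype,
)
print(rec.shape)  # (3,)

# Rows are sliced with the single index; columns are picked by field name.
first_two = rec[:2]
load1 = rec["ldavg-1"]             # plain 1-D float array
pair = rec[["tposix", "ldavg-1"]]  # structured array with two fields

# To get back an ordinary (n, k) array, stack the fields as columns.
plain = np.column_stack([rec[name] for name in rec.dtype.names])
print(plain.shape)  # (3, 4)
```

Stacking only makes sense when all fields share a compatible dtype, as they do
here.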

These 1-dimensional results give me nice text output, but I can't/don't know
how to slice them;  this form may work for one of my use cases, but my
main use case is to reprocess this data--which is for one server--by
taking one field from about 60 servers' worth of this data (saved to disk
as binary pickles) and plotting them all on a single canvas.

In other words, from sixty sets of this:

  tposix         ldavg-15  ldavg-1  ldavg-5
  1265184061.00      0.00     0.03     1.56
  1265184121.00      0.00     0.01     1.46
  1265184181.00      0.00     0.65     1.37

I need to collect and plot ldavg-1 as separate time-series plots.
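
One way this could be sketched, assuming each server's data is already loaded
as a structured array with the field names from the table above. The
per_server dict, the server names, the helper functions, and the values for
server02 are all hypothetical stand-ins; loading the pickles is left out.

```python
import numpy as np

dtype = [("tposix", "f8"), ("ldavg-15", "f8"), ("ldavg-1", "f8"), ("ldavg-5", "f8")]

# Hypothetical stand-in for the ~60 per-server arrays loaded from disk.
# server01 uses the table above; server02 holds made-up sample values.
per_server = {
    "server01": np.array(
        [(1265184061.0, 0.0, 0.03, 1.56),
         (1265184121.0, 0.0, 0.01, 1.46),
         (1265184181.0, 0.0, 0.65, 1.37)],
        dtype=dtype,
    ),
    "server02": np.array(
        [(1265184061.0, 0.1, 0.20, 0.30),
         (1265184121.0, 0.1, 0.25, 0.30),
         (1265184181.0, 0.1, 0.15, 0.28)],
        dtype=dtype,
    ),
}


def collect_field(arrays, field, time_field="tposix"):
    """Return {server: (times, values)} for one field of each server's array."""
    return {name: (rec[time_field], rec[field]) for name, rec in arrays.items()}


def plot_series(series, ylabel):
    """Draw one line per server on a single set of axes (needs matplotlib)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for name in sorted(series):
        times, values = series[name]
        ax.plot(times, values, label=name)
    ax.set_xlabel("tposix")
    ax.set_ylabel(ylabel)
    ax.legend()
    return fig


series = collect_field(per_server, "ldavg-1")
# fig = plot_series(series, "ldavg-1")  # uncomment where matplotlib is installed
```

Keeping each server's own timestamps alongside its values avoids assuming that
all servers were sampled at the same instants.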

(Perhaps I'm trying to use this stuff for a real project too early on the
learning curve?  :)

Thanks for the great help so far.







