[Numpy-discussion] Outer join ?

Robert Kern robert.kern at gmail.com
Thu Feb 12 00:40:59 EST 2009


On Wed, Feb 11, 2009 at 23:24, A B <python6009 at gmail.com> wrote:
> Hi,
>
> I have the following data structure:
>
> col1 | col2 | col3
>
> 20080101|key1|4
> 20080201|key1|6
> 20080301|key1|5
> 20080301|key2|3.4
> 20080601|key2|5.6
>
> For each key in the second column, I would like to create an array
> where for all unique values in the first column, there will be either
> a value or zero if there is no data available. Like so:
>
> # 20080101, 20080201, 20080301, 20080601
>
> key1 - 4, 6, 5,    0
> key2 - 0, 0, 3.4, 5.6
>
> Ideally, the results would end up in a 2d array.
>
> What's the most efficient way to accomplish this? Currently, I am
> getting a list of unique col1 and col2 values into separate variables,
> then looping through each unique value in col2
>
> a = loadtxt(...)
>
> dates = unique(a[:]['col1'])
> keys = unique(a[:]['col2'])
>
> for key in keys:
>    b = a[where(a[:]['col2'] == key)]
>    ???

Take a look at setmember1d().
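
A minimal sketch of how a membership test could fill in the "???" step,
under some assumptions: the sample array below is typed in by hand rather
than read with loadtxt(), the field dtypes are guesses, and each
(col1, col2) pair is assumed to occur at most once. It uses isin(), which
in later NumPy releases covers the role setmember1d() played, so on an old
installation the call would need to be swapped back.

```python
import numpy as np

# Sample records from the question, as a structured array.
# The dtypes here are an assumption; loadtxt() may produce different ones.
a = np.array(
    [
        (20080101, "key1", 4.0),
        (20080201, "key1", 6.0),
        (20080301, "key1", 5.0),
        (20080301, "key2", 3.4),
        (20080601, "key2", 5.6),
    ],
    dtype=[("col1", "i8"), ("col2", "U8"), ("col3", "f8")],
)

dates = np.unique(a["col1"])  # sorted unique dates
keys = np.unique(a["col2"])   # sorted unique keys

# One row per key, one column per date, zero where there is no data.
result = np.zeros((len(keys), len(dates)))

for i, key in enumerate(keys):
    b = a[a["col2"] == key]
    # Sort this key's rows by date so they line up with dates[mask].
    b = b[np.argsort(b["col1"])]
    # Boolean mask over dates: True where this key has a record.
    mask = np.isin(dates, b["col1"])
    result[i, mask] = b["col3"]

print(dates)
print(result)
```

The rows of result follow the order of keys and the columns follow the
order of dates. If a (col1, col2) pair can repeat, the assignment would
fail on a shape mismatch, so duplicates would have to be aggregated first.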

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
