[Numpy-discussion] Array as Variable using "from cdms2 import MV2 as MV"

Keith Goodman kwgoodman at gmail.com
Mon Apr 25 10:32:08 EDT 2011


On Mon, Apr 25, 2011 at 12:17 AM,  <josef.pktd at gmail.com> wrote:
> On Mon, Apr 25, 2011 at 2:50 AM, dileep kunjaai <dileepkunjaai at gmail.com> wrote:
>> Dear sir,
>>
>>      I have two m x n numpy arrays, say "obs" and "fcst". I have to
>> calculate the sum of squares of (obs[i, j] - fcst[i, j]) using "from cdms2
>> import MV2 as MV" in CDAT, without using a "for" loop.
>>
>> For example:
>> obs=
>> [0.6    1.1    0.02    0.2   0.2
>> 0.8    0.    0.    0.4   0.8
>> 0.5    5.5    1.5    0.5   1.5
>> 3.5    0.5    1.5    5.0   2.6
>> 5.1    4.1    3.2    2.3   1.5
>> 4.4    0.9    1.5    2.    2.3
>> 1.1    1.1    1.5    12.6  1.3
>> 2.2    12    1.7    1.6   15
>> 1.9    1.5    0.9    2.5   5.5 ]
>>
>>
>>
>> fcst=
>>
>> [0.7    0.1    0.2    0.2   0.2
>> 0.3    0.8    0.    0.    0.
>> 0.5    0.5    0.5    0.5   0.5
>> 0.7    1.     1.5    2.    2.6
>> 5.1    4.1    3.2    2.3   1.5
>> 0.7    1.    1.5    2.    2.3
>> 1.1    1.1    1.1    12.7  1.3
>> 2.2    2.    1.7    1.6   1.5
>> 1.9    1.5    0.9    0.5   7.5]
>>
>> Here "obs" and "fcst" are numpy arrays.
>> I convert them with
>>
>> obs=MV.array(obs)
>> fcst=MV.array(fcst)
>>
>> Then I compute
>>
>> sbst=obs-fcst
>>
>> which gives
>>
>> sbst=
>> [[ -0.1    1.    -0.18   0.     0.  ]
>>  [  0.5   -0.8    0.     0.4    0.8 ]
>>  [  0.     5.     1.     0.     1.  ]
>>  [  2.8   -0.5    0.     3.     0.  ]
>>  [  0.     0.     0.     0.     0.  ]
>>  [  3.7   -0.1    0.     0.     0.  ]
>>  [  0.     0.     0.4   -0.1    0.  ]
>>  [  0.    10.     0.     0.    13.5 ]
>>  [  0.     0.     0.     2.    -2.  ]]
>>
>> But I don't know how to find the sum of squares of each term. (Actually my aim is
>> to find the MEAN SQUARED ERROR.)
>
> (sbst**2).sum()
>
> or, summing over axis 0 (one value per column):
> (sbst**2).sum(0)
>
> explanation is in the documentation
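
Since the stated goal is the mean squared error, here is a minimal sketch of the expressions above in plain NumPy. It uses only the first three rows of the "obs" and "fcst" data from the question, and it assumes plain numpy arrays rather than MV2 arrays (I have not checked it against cdms2). The names sum_sq, sum_sq_cols and mse are only for illustration.

```python
import numpy as np

# First three rows of the data from the question (plain numpy arrays,
# not MV2 arrays; that substitution is an assumption of this sketch).
obs = np.array([
    [0.6, 1.1, 0.02, 0.2, 0.2],
    [0.8, 0.0, 0.0, 0.4, 0.8],
    [0.5, 5.5, 1.5, 0.5, 1.5],
])
fcst = np.array([
    [0.7, 0.1, 0.2, 0.2, 0.2],
    [0.3, 0.8, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.5, 0.5, 0.5],
])

sbst = obs - fcst                  # elementwise difference, no loop needed

sum_sq = (sbst ** 2).sum()         # sum of squares over all elements
sum_sq_cols = (sbst ** 2).sum(0)   # sum of squares over axis 0, one per column
mse = (sbst ** 2).mean()           # mean squared error = sum_sq / sbst.size

print(sum_sq, sum_sq_cols, mse)
```

The mean is just the total sum of squares divided by the number of elements, so either form works once sbst is available.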

If speed is an issue then the development version of bottleneck
(0.5.dev) has a fast sum of squares function:

    >> import numpy as np
    >> import bottleneck as bn
    >> a = np.random.rand(1000, 10)
    >> timeit (a * a).sum(1)
    10000 loops, best of 3: 143 us per loop
    >> timeit bn.ss(a, 1)
    100000 loops, best of 3: 14.3 us per loop

It is a faster replacement for scipy.stats.ss:

    >> from scipy import stats
    >> timeit stats.ss(a, 1)
    10000 loops, best of 3: 159 us per loop

The speed-up factor depends on the shape of the input array and the axis:

    >> timeit stats.ss(a, 0)
    10000 loops, best of 3: 42.8 us per loop
    >> timeit bn.ss(a, 0)
    100000 loops, best of 3: 17.3 us per loop


