[SciPy-User] ttest_rel with unequal groups

josef.pktd at gmail.com
Fri Nov 8 06:38:40 EST 2013


On Fri, Nov 8, 2013 at 6:26 AM, <josef.pktd at gmail.com> wrote:

>
>
>
> On Fri, Nov 8, 2013 at 6:04 AM, <josef.pktd at gmail.com> wrote:
>
>>
>>
>>
>> On Thu, Nov 7, 2013 at 10:51 PM, Horea Christian <h.chr at mail.ru> wrote:
>>
>>> I managed to get ttest_rel to work by replacing my missing values
>>> with either NaN or False. I have yet to determine whether that
>>> distorts my data (it could be that d = (a - b).astype(np.float64) is
>>> zero for entries where one value is False, or that False is read as
>>> zero and d = (a - b).astype(np.float64) will be -b[x] wherever a[x] is
>>> False...)
>>>
>>
>> It will distort your results: each filled-in value is treated as a
>> non-missing observation, which affects both the estimated difference and
>> the number of observations, and therefore the degrees of freedom used for
>> the p-value.
>>
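A minimal sketch of the distortion described above, using made-up reaction times (the arrays `a` and `b`, their sizes, and the missing positions are assumptions for illustration only). It compares dropping incomplete pairs with filling the gaps with zeros, which is what False turns into in arithmetic:

```python
import numpy as np
from scipy import stats

# Synthetic paired data (hypothetical values, not from the thread).
rng = np.random.default_rng(0)
a = rng.normal(500.0, 50.0, size=12)
b = a + rng.normal(20.0, 10.0, size=12)
a[[2, 7]] = np.nan  # two missing measurements in the first condition

# Keep only the pairs where both values are present.
complete = ~(np.isnan(a) | np.isnan(b))
t_drop, p_drop = stats.ttest_rel(a[complete], b[complete])

# Filling the gaps with zeros adds two large bogus differences and
# two extra observations (hence two extra degrees of freedom).
a_zero = np.where(np.isnan(a), 0.0, a)
t_zero, p_zero = stats.ttest_rel(a_zero, b)

print("complete pairs:", int(complete.sum()), t_drop, p_drop)
print("zero-filled:   ", a_zero.size, t_zero, p_zero)
```

The two calls answer different questions: only the first one tests the difference between conditions on observations that were actually measured.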
>
> Instead of deleting the missing observations, you can also use
> stats.mstats, which does the deletion internally
>
>
> >>> stats.mstats.ttest_rel(outcome[:, :1], outcome[:, 1:], axis=0)
> (array([-1.60220806, -3.13556782, -7.1567637 ]), masked_array(data = [
>  1.25604679e-01   5.44534856e-03   8.41006537e-07],
>              mask = False,
>        fill_value = 1e+20)
>
> >>> [stats.mstats.ttest_rel(np.ma.masked_array(outcome[:, 0]), outcome[:, k])
> ...  for k in range(1, 4)]
> [(array(-1.6022080647700057), masked_array(data = 0.125604679404,
>              mask = False,
>        fill_value = 1e+20)
> ), (array(-3.135567822455234), masked_array(data = 0.00544534855662,
>              mask = False,
>        fill_value = 1e+20)
> ), (array(-7.156763700790863), masked_array(data = 8.41006537032e-07,
>              mask = False,
>        fill_value = 1e+20)
> )]
>
>
> mstats.ttest_rel has the wrong axis default (None instead of 0)
> and raises an exception on non-masked arrays when axis=None.
> Looks like a BUG.
>

https://github.com/scipy/scipy/issues/3047
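For data that actually contains missing values, a sketch along the following lines may help. The arrays `a_raw` and `b_raw` are synthetic and the NaN positions are assumptions. The sketch masks a pair whenever either member is missing, passes `axis=0` explicitly because of the default discussed above, and checks the result against `stats.ttest_rel` on the complete pairs:

```python
import numpy as np
from scipy import stats

# Synthetic paired data with one gap in each condition (hypothetical).
rng = np.random.default_rng(1)
a_raw = rng.standard_normal(20)
b_raw = rng.standard_normal(20) + 1.0
a_raw[3] = np.nan
b_raw[11] = np.nan

# Mask a pair as soon as either member is missing, so that both arrays
# agree on which observations are dropped.
joint = (np.ma.getmaskarray(np.ma.masked_invalid(a_raw))
         | np.ma.getmaskarray(np.ma.masked_invalid(b_raw)))
a_m = np.ma.masked_array(a_raw, mask=joint)
b_m = np.ma.masked_array(b_raw, mask=joint)

# Pass axis explicitly rather than relying on the default.
t_m, p_m = stats.mstats.ttest_rel(a_m, b_m, axis=0)

# Reference: plain ttest_rel on the complete pairs only.
t_ref, p_ref = stats.ttest_rel(a_raw[~joint], b_raw[~joint])
print(float(t_m), float(p_m))
print(float(t_ref), float(p_ref))
```

Using one joint mask for both arrays is a deliberate choice here: the number of unmasked values in each input then equals the number of unmasked differences.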


>
> Josef
>
>
>
>>
>>
>>
>>>
>>> In any case, I am a bit uncertain about how to use this function - am I
>>> supposed to pass it a 1d array or a 2d array? I am thinking 2d should
>>> be mandatory, because otherwise the function can't tell which groups'
>>> measurements are related. I tried doing that (my array being
>>> N(participants) x N(measurements)) but that gave me a 2d output - that
>>> can't be right, I just want one t and one p value, not a multidim array.
>>>
>>> So, how do I use this? (The docs are not very informative on what
>>> happens with 2d vs 1d inputs.)
>>>
>>
>> You need to give it two arrays; the difference between the arrays is
>> calculated internally.
>>
>> If the arrays are 2d, then the test is calculated for each column (or
>> along axis) of the broadcast difference.
>> These are separate tests for each column, and they give the same result as
>> looping over the columns.
>>
>> If one array has only one column (for example a benchmark treatment) and
>> the other array has several columns, then we get ttest_rel for each
>> comparison of a column of the second array to the first array.
>>
>> The result will have as many t-statistics and p-values as there are columns.
>> There is no multiple testing correction for the p-values.
>>
>> >>> outcome = np.random.randn(20, 4) + [0, 0, 1, 2]
>> >>> from scipy import stats
>> >>> stats.ttest_rel(outcome[:, :1], outcome[:, 1:])
>> (array([-1.60220806, -3.13556782, -7.1567637 ]), array([  1.25604679e-01,
>>   5.44534856e-03,   8.41006537e-07]))
>>
>> >>> [stats.ttest_rel(outcome[:, 0], outcome[:, k]) for k in range(1, 4)]
>> [(array(-1.6022080647700057), 0.12560467940402195),
>> (array(-3.135567822455234), 0.005445348556616313),
>> (array(-7.156763700790868), 8.4100653703218436e-07)]
>>
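Since the p-values above come without any multiple testing correction, one option is to adjust them afterwards. The following is a minimal sketch of the Bonferroni and Holm adjustments written directly in NumPy. These two procedures are my choice for illustration, not something ttest_rel provides, and the data are regenerated with a seed, so the numbers will not match the session above:

```python
import numpy as np
from scipy import stats

# Same layout as the example above: column 0 is the benchmark.
rng = np.random.default_rng(2)
outcome = rng.standard_normal((20, 4)) + [0, 0, 1, 2]

# One paired test per comparison column, collected into an array.
p = np.array([stats.ttest_rel(outcome[:, 0], outcome[:, k])[1]
              for k in range(1, 4)])
m = p.size

# Bonferroni: multiply every p-value by the number of tests, cap at 1.
p_bonf = np.minimum(p * m, 1.0)

# Holm: step-down version; the i-th smallest p-value (0-based) is
# multiplied by (m - i) and monotonicity is enforced with a running max.
order = np.argsort(p)
holm_sorted = np.maximum.accumulate((m - np.arange(m)) * p[order])
p_holm = np.empty(m)
p_holm[order] = np.minimum(holm_sorted, 1.0)

print(p)
print(p_bonf)
print(p_holm)
```

Holm is never more conservative than Bonferroni, so it is usually the better default when only a handful of comparisons is involved.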
>>
>> aside: I think the following is doing the right thing for testing the
>> joint hypothesis
>>
>> >>> diff = outcome[:, 1:] - outcome[:, :1]
>> >>> stats.f_oneway(*diff.T)
>> (10.606594036595835, 0.00012132595252973279)
>>
>>
>> Josef
>>
>>
>>
>>>
>>> Cheers,
>>> christian
>>>
>>> On Do 07 Nov 2013 10:52:57 CET, Hjalmar Turesson wrote:
>>> > Hi,
>>> >
>>> > If I'm not confused, ttest_rel is a paired-samples t-test
>>> > (http://en.wikipedia.org/wiki/Paired_difference_test), and thus
>>> > requires that all samples are paired (this does not depend on the
>>> > particular scipy implementation).
>>> > If occasional samples in one group are missing, and you still want to
>>> > perform the paired t-test, then you will probably have to exclude the
>>> > corresponding samples in the other group, or generate pseudo-values to
>>> > replace the missing values in the first group. Alternatively, you can
>>> > use ttest_ind
>>> > (http://en.wikipedia.org/wiki/Ttest#Independent_samples), which
>>> > doesn't require exactly the same number of samples in the two groups.
>>> >
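As a rough illustration of the ttest_ind alternative mentioned above, here is a sketch with two groups of unequal size. The reaction times are synthetic, and the group sizes 95 and 100 are borrowed from the question only as an example. Note that ttest_ind treats the groups as independent, so any pairing in the data is ignored:

```python
import numpy as np
from scipy import stats

# Synthetic reaction times (hypothetical values), unequal group sizes.
rng = np.random.default_rng(3)
cond1 = rng.normal(520.0, 60.0, size=95)
cond2 = rng.normal(500.0, 60.0, size=100)

# Standard independent-samples t-test (assumes equal variances).
t_pooled, p_pooled = stats.ttest_ind(cond1, cond2)

# Welch's variant, which does not assume equal variances.
t_welch, p_welch = stats.ttest_ind(cond1, cond2, equal_var=False)

print(t_pooled, p_pooled)
print(t_welch, p_welch)
```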
>>> >
>>> > On Thu, Nov 7, 2013 at 2:18 AM, Horea Christian <h.chr at mail.ru> wrote:
>>> >
>>> >     Hey there! I would like to use the ttest_rel function to compare
>>> >     reaction times for two conditions tested over 10 participants. We
>>> >     have done 100 trials per participant, but some of them had errors
>>> >     and were excluded. For instance, for participants 1 and 2 I have
>>> >     condition1: 95 trials, condition2: 100 trials AND condition1: 100
>>> >     trials, condition2: 99 trials.
>>> >
>>> >     Depending on whether or not I transpose my dataframe, I get a
>>> >     complaint either at
>>> >
>>> >          if a.shape[axis] != b.shape[axis]:
>>> >              raise ValueError('unequal length arrays')
>>> >
>>> >     or at
>>> >
>>> >          d = (a - b).astype(np.float64)
>>> >
>>> >     .
>>> >
>>> >
>>> >     What can I do about this? I found it surprising that it doesn't
>>> >     "just work", since in most experiments it is expected for some of
>>> >     the measurements to fail.
>>> >
>>> >     Many Thanks!
>>> >     Christian
>>> >
>>> >     --
>>> >     Horea Christian
>>> >     http://chymera.eu
>>> >
>>> >     _______________________________________________
>>> >     SciPy-User mailing list
>>> >     SciPy-User at scipy.org
>>> >     http://mail.scipy.org/mailman/listinfo/scipy-user
>>> >
>>> >
>>> >
>>> >
>>>
>>> --
>>> Horea Christian
>>> http://chymera.eu
>>>
>>
>>
>