[SciPy-User] ttest_rel with unequal groups
josef.pktd at gmail.com
Fri Nov 8 06:26:13 EST 2013
On Fri, Nov 8, 2013 at 6:04 AM, <josef.pktd at gmail.com> wrote:
>
>
>
> On Thu, Nov 7, 2013 at 10:51 PM, Horea Christian <h.chr at mail.ru> wrote:
>
>> I managed to get ttest_rel to work by replacing my missing values
>> with either NaN or False. I have yet to determine whether that
>> distorts my data (it could be that d = (a - b).astype(np.float64) is
>> zero for entries where one value is False, or that False is read as
>> zero and d = (a - b).astype(np.float64) will be -b[x] wherever a[x] is
>> False...)
>>
>
> It will distort your results: each filled-in value is treated as a
> non-missing observation, which affects both the estimated difference and
> the number of observations, and hence the degrees of freedom for the p-value.
>
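To make the distortion concrete, here is a minimal sketch with made-up numbers (the reaction times below are hypothetical, not data from this thread). It compares dropping every pair that has a missing value against filling the gaps with 0, which is what False turns into in arithmetic.

```python
import numpy as np
from scipy import stats

# Hypothetical paired reaction times in seconds; NaN marks an excluded value.
a = np.array([0.52, 0.61, np.nan, 0.48, 0.55, 0.63, 0.58, 0.50])
b = np.array([0.57, 0.66, 0.59, 0.51, np.nan, 0.70, 0.61, 0.56])

# Option 1: keep only the pairs in which both values were observed.
complete = ~(np.isnan(a) | np.isnan(b))
res_dropped = stats.ttest_rel(a[complete], b[complete])
t_dropped, p_dropped = float(res_dropped[0]), float(res_dropped[1])
n_dropped = int(complete.sum())

# Option 2: replace missing values by 0 and keep every pair.
a_filled = np.where(np.isnan(a), 0.0, a)
b_filled = np.where(np.isnan(b), 0.0, b)
res_filled = stats.ttest_rel(a_filled, b_filled)
t_filled, p_filled = float(res_filled[0]), float(res_filled[1])
n_filled = int(a_filled.size)

print("dropped: n=%d t=%.3f p=%.4f" % (n_dropped, t_dropped, p_dropped))
print("filled:  n=%d t=%.3f p=%.4f" % (n_filled, t_filled, p_filled))
```

In the filled version the two artificial differences (0 - b and a - 0) enter the mean and the variance of the differences, and the test is computed with 8 observations instead of the 6 complete pairs.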
Instead of deleting the missing observations yourself, you can also use
stats.mstats, which does the deletion internally:
>>> stats.mstats.ttest_rel(outcome[:, :1], outcome[:, 1:], axis=0)
(array([-1.60220806, -3.13556782, -7.1567637 ]),
 masked_array(data = [ 1.25604679e-01 5.44534856e-03 8.41006537e-07],
 mask = False,
 fill_value = 1e+20)
)
>>> [stats.mstats.ttest_rel(np.ma.masked_array(outcome[:, 0]), outcome[:, k])
...  for k in range(1, 4)]
[(array(-1.6022080647700057), masked_array(data = 0.125604679404,
 mask = False,
 fill_value = 1e+20)
), (array(-3.135567822455234), masked_array(data = 0.00544534855662,
 mask = False,
 fill_value = 1e+20)
), (array(-7.156763700790863), masked_array(data = 8.41006537032e-07,
 mask = False,
 fill_value = 1e+20)
)]
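The calls above contain no missing values, so here is a hedged sketch of the masked-array route when some observations are missing. The data are hypothetical. As an assumption on my side, I give both arrays the same mask (a pair is masked if either member is missing), so the result does not depend on how the function counts observations when the two masks differ, and I pass axis=0 explicitly.

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements; NaN marks a missing value.
x = np.array([0.52, 0.61, np.nan, 0.48, 0.55, 0.63, 0.58, 0.50])
y = np.array([0.57, 0.66, 0.59, 0.51, np.nan, 0.70, 0.61, 0.56])

# Mask a pair as soon as either of its members is missing.
pair_missing = np.isnan(x) | np.isnan(y)
xm = np.ma.masked_array(x, mask=pair_missing)
ym = np.ma.masked_array(y, mask=pair_missing)

# Masked version: the masked pairs are ignored by the test.
res_masked = stats.mstats.ttest_rel(xm, ym, axis=0)
t_masked, p_masked = float(res_masked[0]), float(res_masked[1])

# Reference: delete the incomplete pairs by hand and use the plain test.
res_plain = stats.ttest_rel(x[~pair_missing], y[~pair_missing])
t_plain, p_plain = float(res_plain[0]), float(res_plain[1])

print(t_masked, p_masked)
print(t_plain, p_plain)
```

Both routes should agree, since the masked test then sees exactly the complete pairs.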
Note that mstats.ttest_rel has the wrong axis default (None instead of 0)
and raises an exception on non-masked arrays when axis=None.
This looks like a bug.
Josef
>
>
>
>>
>> In any case, I am a bit uncertain about the usage of this method: am I
>> supposed to pass it a 1d array or a 2d array? I am thinking 2d should
>> be mandatory, because otherwise the method can't tell which groups'
>> measures are related. I tried doing that (my array being
>> N(participants) x N(measurements)) but that gave me a 2d output. That
>> can't be right; I just want one t and one p value, not a multidim array.
>>
>> So, how do I use this? (The docs are not very informative on what
>> happens to 2d vs 1d inputs).
>>
>
> You need to give it two arrays; the difference between the arrays is
> calculated internally.
>
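As a small illustration of "the difference is calculated internally": the paired test on two arrays is the same as a one-sample t-test of their difference against zero. The data below are simulated and the names before/after are only for illustration.

```python
import numpy as np
from scipy import stats

# Simulated paired data (illustrative only).
rng = np.random.RandomState(0)
before = rng.randn(20)
after = before + 0.5 + 0.3 * rng.randn(20)

# Paired test on the two arrays.
res_rel = stats.ttest_rel(before, after)
t_rel, p_rel = float(res_rel[0]), float(res_rel[1])

# One-sample test on the explicit difference, null hypothesis: mean 0.
res_one = stats.ttest_1samp(before - after, 0.0)
t_one, p_one = float(res_one[0]), float(res_one[1])

print(t_rel, p_rel)
print(t_one, p_one)
```

This is also why both arrays must have the same length along the tested axis: each element of one array is subtracted from its partner in the other.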
> If the arrays are 2d, then the test is calculated for each column (or
> along axis) of the broadcasted difference.
> These are separate tests for each column that give the same result as
> looping over the columns.
>
> If one array has only one column (for example a benchmark treatment) and
> the other array has several columns, then we get ttest_rel for each
> comparison of a column of the second array to the first array.
>
> The result will have as many t-statistics and p-values as there are columns.
> There is no multiple testing correction for the p-values.
>
> >>> import numpy as np
> >>> from scipy import stats
> >>> outcome = np.random.randn(20, 4) + [0, 0, 1, 2]
> >>> stats.ttest_rel(outcome[:, :1], outcome[:, 1:])
> (array([-1.60220806, -3.13556782, -7.1567637 ]),
>  array([ 1.25604679e-01, 5.44534856e-03, 8.41006537e-07]))
>
> >>> [stats.ttest_rel(outcome[:, 0], outcome[:, k]) for k in range(1, 4)]
> [(array(-1.6022080647700057), 0.12560467940402195),
> (array(-3.135567822455234), 0.005445348556616313),
> (array(-7.156763700790868), 8.4100653703218436e-07)]
>
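Since the p-values come back uncorrected, one option is to adjust them by hand. The sketch below applies Bonferroni and Holm corrections to the three p-values printed above; the helper names bonferroni and holm are my own, not scipy functions, and which correction is appropriate depends on the analysis.

```python
import numpy as np

def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    p = np.asarray(pvalues, dtype=float)
    return np.minimum(p * p.size, 1.0)

def holm(pvalues):
    """Holm step-down adjustment, returned in the original order."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Smallest p-value is multiplied by m, the next by m - 1, and so on.
    stepped = (m - np.arange(m)) * p[order]
    # Adjusted values must not decrease along the sorted order; cap at 1.
    stepped = np.minimum(np.maximum.accumulate(stepped), 1.0)
    adjusted = np.empty(m)
    adjusted[order] = stepped
    return adjusted

# The three uncorrected p-values from the ttest_rel output above.
raw = [1.25604679e-01, 5.44534856e-03, 8.41006537e-07]
p_bonf = bonferroni(raw)
p_holm = holm(raw)
print(p_bonf)
print(p_holm)
```

Holm is never more conservative than Bonferroni, so its adjusted p-values are less than or equal to the Bonferroni ones.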
>
> aside: I think the following is doing the right thing for testing the
> joint hypothesis
>
> >>> diff = outcome[:, 1:] - outcome[:, :1]
> >>> stats.f_oneway(*diff.T)
> (10.606594036595835, 0.00012132595252973279)
>
>
> Josef
>
>
>
>>
>> Cheers,
>> christian
>>
>> On Thu 07 Nov 2013 10:52:57 CET, Hjalmar Turesson wrote:
>> > Hi,
>> >
>> > If I'm not confused, ttest_rel is a paired samples t-test
>> > (http://en.wikipedia.org/wiki/Paired_difference_test), and thus
>> > requires that all samples are paired (this does not depend on the
>> > particular scipy implementation).
>> > If occasional samples in a group are missing, and you still want to
>> > perform the paired t-test, then you will probably have to exclude the
>> > corresponding sample in the 2nd group, or generate pseudo-values to
>> > replace the missing values in the 1st group. Alternatively, you can
>> > use ttest_ind
>> > (http://en.wikipedia.org/wiki/Ttest#Independent_samples), which
>> > doesn't require exactly the same number of samples in the two groups.
>> >
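A minimal sketch of the ttest_ind alternative, using simulated data with the trial counts mentioned in the original question (95 and 100); the means and spreads are made up. Note that ttest_ind treats the two groups as independent samples, so it ignores any pairing and tests a different model than ttest_rel.

```python
import numpy as np
from scipy import stats

# Simulated reaction times with unequal numbers of trials (made-up values).
rng = np.random.RandomState(1)
cond1 = 0.55 + 0.05 * rng.randn(95)
cond2 = 0.60 + 0.05 * rng.randn(100)

# The independent-samples test accepts groups of different sizes.
res_ind = stats.ttest_ind(cond1, cond2)
t_ind, p_ind = float(res_ind[0]), float(res_ind[1])
print(t_ind, p_ind)

# The paired test refuses arrays of different lengths.
try:
    stats.ttest_rel(cond1, cond2)
    rel_failed = False
except ValueError as exc:
    rel_failed = True
    print("ttest_rel raised ValueError:", exc)
```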
>> >
>> > On Thu, Nov 7, 2013 at 2:18 AM, Horea Christian <h.chr at mail.ru> wrote:
>> >
>> > Hey there! I would like to use the ttest_rel function to compare
>> > reaction times for two conditions tested over 10 participants. We have
>> > done 100 trials per participant, but some of them had errors and were
>> > excluded. For instance, for participant 1 I have condition1: 95
>> > trials and condition2: 100 trials, and for participant 2 I have
>> > condition1: 100 trials and condition2: 99 trials.
>> >
>> > Depending on whether or not I transpose my dataframe, I get a complaint
>> > either at
>> >
>> >     if a.shape[axis] != b.shape[axis]:
>> >         raise ValueError('unequal length arrays')
>> >
>> > or at
>> >
>> >     d = (a - b).astype(np.float64)
>> >
>> >
>> > What can I do about this? I found it surprising that it doesn't "just
>> > work", since in most experiments it is expected for some of the
>> > measurements to fail.
>> >
>> > Many Thanks!
>> > Christian
>> >
>> > --
>> > Horea Christian
>> > http://chymera.eu
>> >
>> > _______________________________________________
>> > SciPy-User mailing list
>> > SciPy-User at scipy.org
>> > http://mail.scipy.org/mailman/listinfo/scipy-user
>> >
>>
>> --
>> Horea Christian
>> http://chymera.eu