
On Thu, Jun 2, 2011 at 12:53 AM, Roger Lew <rogerlew@gmail.com> wrote:
Hi,

I have implemented some approximations for studentized range quantiles and probabilities based on John R. Gleason's (1999) "An accurate, non-iterative approximation for studentized range quantiles," Computational Statistics & Data Analysis, 31, 147-158. The probability approximations rely on scipy.optimize.fminbound. The functions accept both scalars and array-like data thanks to numpy.vectorize. A fair amount of validation and testing has been conducted on the code. More details can be found here: http://code.google.com/p/qsturng-py/

I welcome any thoughts as to whether you all think this might be useful to add to SciPy or to make into a scikit. Any general comments would be helpful as well. I should mention that I'm a cognitive neuroscientist by trade, so my use of statistical jargon probably isn't that good.

Regards,
Roger
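[For orientation, a minimal usage sketch of the two functions described above; the import path is an assumption based on the project name, and the call signatures follow the calls used later in this thread:]

# Hypothetical import; the actual module layout of qsturng-py may differ.
from qsturng import qsturng, psturng

# 5% critical value of the studentized range for r = 3 groups and v = 16
# degrees of freedom; the first argument is the cumulative probability.
q = qsturng(0.95, 3, 16)

# psturng returns the upper-tail (survival) probability, so this should
# come back close to 0.05.
p_upper = psturng(q, 3, 16)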
Hi Roger,

I'm very interested in using this in scikits.statsmodels. The table that I am currently using is very limited:
http://statsmodels.sourceforge.net/devel/generated/scikits.statsmodels.sandb...

From a quick look it looks very good. What I found a bit confusing is that qsturng takes the probability of the cdf and not of the survival function. Without reading the docstring carefully enough, I interpreted it as a p-value (upper tail), especially since psturng returns the upper tail probability:
import numpy as np
import scikits.statsmodels.sandbox.stats.multicomp as smmc
# qsturng and psturng as provided by qsturng-py
for i in range(3, 10):
    x = qsturng(0.95, i, 16)
    print (x, psturng(x, i, 16), smmc.get_tukeyQcrit(i, 16, 0.05),
           smmc.tukey_pvalues(x * np.ones(i), 16)[0])
(3.647864471854692, 0.049999670839029453, array(3.6499999999999999), 0.050092818925981608)
(4.0464124382823847, 0.050001178443752514, array(4.0499999999999998), 0.037164602483501508)
(4.3332505094058114, 0.049999838126148499, array(4.3300000000000001), 0.029954033157223781)
(4.5573603020371234, 0.049999276281813887, array(4.5599999999999996), 0.025276987281047769)
(4.7410585998112742, 0.049998508166777755, array(4.7400000000000002), 0.022010630154416622)
(4.8965400268915289, 0.04999983345598491, array(4.9000000000000004), 0.019614841752159107)
(5.0312039650945257, 0.049999535359310343, array(5.0300000000000002), 0.017721848279719898)

The last column is (in my interpretation) supposed to be 0.05. I was trying to get the p-values for Tukey's range statistic through the multivariate t-distribution, but the unit test looks only at one point (and I ran out of time to work on this during the Christmas break). Either there is a bug (it's still in the sandbox) or my interpretation is wrong. The advantage of the multivariate t-distribution is that it allows for arbitrary correlation, but it's not a substitute for pre-calculated tables for standard cases/distributions because it's much too slow.

------------

As a bit of background on the multiple testing / multiple comparison status in statsmodels:

The tukeyhsd test has one test case against R, but it has too many options (it allows unequal variances and unequal sample sizes) that still need to be checked:
http://statsmodels.sourceforge.net/devel/generated/scikits.statsmodels.sandb...

What I did manage to finish and verify against R:
http://statsmodels.sourceforge.net/devel/generated/scikits.statsmodels.sandb...

Multiple testing for general linear models is very incomplete.

As an aside: I'm not a statistician, and if the module in the statsmodels sandbox is still a mess, it's because it took me a long time and many functions to figure out what's going on.

----------

scipy.special has a nice collection of standard distribution functions, but it would be very useful to have some additional distributions available, either in scipy or scikits.statsmodels, like your studentized range statistic (and maybe some others for multiple comparisons, like Duncan, Dunnett), and Anderson-Darling, and ...

Thanks,
Josef
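[For readers less familiar with the statistic, a hedged sketch of how such a studentized range quantile is typically used in a standard Tukey HSD comparison with equal group sizes; the helper below is illustrative only and not part of statsmodels or qsturng-py:]

import numpy as np
from qsturng import qsturng  # assumed import, as in the sketch above

def tukey_hsd_reject(means, mse, n, df, alpha=0.05):
    """Illustrative Tukey HSD: flag pairs whose mean difference exceeds
    the honestly significant difference q_crit * sqrt(MSE / n)."""
    means = np.asarray(means, dtype=float)
    k = len(means)
    # critical studentized range at cumulative probability 1 - alpha
    q_crit = qsturng(1.0 - alpha, k, df)
    hsd = q_crit * np.sqrt(mse / n)
    # pairwise absolute differences of the group means
    diffs = np.abs(means[:, None] - means[None, :])
    return diffs > hsd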