Simple addition to random module - Student's t
Thomas Philips
tkpmep at gmail.com
Wed Sep 2 21:31:10 CEST 2009
On Sep 2, 2:37 pm, Mark Dickinson <dicki... at gmail.com> wrote:
> On Sep 2, 6:15 pm, Thomas Philips <tkp... at gmail.com> wrote:
>
> > I mis-spoke - the variance is infinite when df=2 (the variance is df/
> > (df-2),
>
> Yes: the variance is infinite both for df=2 and df=1, and Student's t
> with df=1 doesn't even have an expectation. I don't see why this
> would stop you from generating meaningful samples, though.
>
> > and you get the Cauchy when df=2.
>
> Are you sure about this? All my statistics books are currently hiding
> in my mother-in-law's attic, several hundred miles away, but wikipedia
> and mathworld seem to say that df=1 gives you the Cauchy distribution.
>
> > I made the mistake because the denominator is equivalent to the
> > square root of the sample variance of df normal observations,
>
> As I'm reading it, the denominator is the square root of the sample
> variance of *df+1* independent standard normal observations. I agree
> that the wikipedia description is a bit confusing.
>
> It seems that there are uses for Student's t distribution with
> non-integral degrees of freedom. The Boost library, and the R
> programming language both allow non-integral degrees of freedom.
> So (as Robert Kern already suggested), you could drop the test
> for integrality of df. In fact, you could just drop the tests
> on df entirely: df <= 0.0 will be picked up in the gammavariate
> call.
>
> --
> Mark
To tell you the truth, I have never used it with a non-integer number
of degrees of freedom, but that's not the same as saying that df
should be an integer. When df is an integer, one can interpret the
t-distribution as the sample mean of df+1 unit normals (i.e. N(0,1))
divided by the standard error of that mean, i.e. their sample standard
deviation over sqrt(df+1). However, as Robert Kern correctly observes,
the distribution is defined for all positive df, integer or not,
though for non-integer df we lose the above interpretation and must
think of it in abstract terms. The variance is infinite when
1 < df <= 2, and both the mean and the variance are undefined when
df <= 1, but the code can still be used to generate samples.
Whether or not these samples make sense is another question
altogether, but it's easy enough to remove the restrictions.
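
For reference, here is a minimal sketch of what the generator might look
like with the checks on df removed. It is not the code from the proposed
patch: the name student_t and the optional rng argument are made up for
illustration. It uses the standard construction Z / sqrt(V/df), where Z
is a unit normal and V is chi-square with df degrees of freedom, drawn
as gammavariate(df/2, 2).

```python
import math
import random


def student_t(df, rng=random):
    """Draw one sample from Student's t with df > 0 degrees of freedom.

    Hypothetical helper for illustration; df may be non-integral.
    """
    # Numerator: a unit normal, N(0, 1).
    z = rng.normalvariate(0.0, 1.0)
    # Denominator: chi-square(df) is Gamma(shape=df/2, scale=2).
    # gammavariate raises ValueError when df <= 0, so no explicit
    # check on df is made here.
    v = rng.gammavariate(df / 2.0, 2.0)
    return z / math.sqrt(v / df)


if __name__ == "__main__":
    # Non-integral df works the same way as integral df.
    print([round(student_t(2.5), 3) for _ in range(5)])
```

With df <= 0 the gammavariate call raises ValueError, which is the
behaviour Mark describes, so no separate test on df is needed.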