[SciPy-User] deterministic random variable

Mon May 3 09:35:42 EDT 2010

On Mon, May 3, 2010 at 9:16 AM,  <josef.pktd at gmail.com> wrote:
> On Mon, May 3, 2010 at 6:04 AM, nicky van foreest <vanforeest at gmail.com> wrote:
>> Hi,
>>
>> As far as I can see, scipy.stats does not support the deterministic
>> distribution. Would it be a good idea to implement it as well? In my
>> opinion this distribution is very useful as a test case, for instance
>> when debugging.
>
> Do you mean something like http://en.wikipedia.org/wiki/Degenerate_distribution ?
> (I had never heard the term "deterministic distribution" before.)
>
> If the support is an integer, then rv_discrete might work; it looks good, see below.
>
> Are there any useful operations that we could do with it?
> I think I can see a case for debugging programs that use the
> distributions in scipy.stats, but an almost degenerate distribution
> might also work for debugging.
>
> What I would like to have is a discrete distribution on the real line,
> instead of the integers, like rv_discrete but with support on
> arbitrary floats. This could use the machinery of rv_discrete but
> would need a generalizing rewrite.
>
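A minimal sketch of what such a float-support discrete distribution could look like, using plain NumPy rather than the rv_discrete machinery. The class name FloatDiscrete and its methods are hypothetical and only illustrate the idea; this is not an existing scipy.stats API. Note that ppf(0) here returns the smallest support point, which differs from the rv_discrete output shown further down.

```python
import numpy as np


class FloatDiscrete:
    """Hypothetical helper: a discrete distribution on arbitrary float support."""

    def __init__(self, xk, pk):
        xk = np.asarray(xk, dtype=float)
        pk = np.asarray(pk, dtype=float)
        if xk.ndim != 1 or xk.shape != pk.shape:
            raise ValueError("xk and pk must be 1-D arrays of equal length")
        if np.any(pk < 0) or not np.isclose(pk.sum(), 1.0):
            raise ValueError("pk must be non-negative and sum to 1")
        order = np.argsort(xk)        # keep the support sorted for searchsorted
        self.xk = xk[order]
        self.pk = pk[order]
        self.ck = np.cumsum(self.pk)  # cumulative probability at each support point
        self.ck[-1] = 1.0             # guard against rounding in the last entry

    def pmf(self, x):
        x = np.asarray(x, dtype=float)
        idx = np.clip(np.searchsorted(self.xk, x), 0, len(self.xk) - 1)
        # nonzero only where x coincides exactly with a support point
        return np.where(self.xk[idx] == x, self.pk[idx], 0.0)

    def cdf(self, x):
        x = np.asarray(x, dtype=float)
        n = np.searchsorted(self.xk, x, side="right")  # support points <= x
        return np.where(n > 0, self.ck[np.maximum(n - 1, 0)], 0.0)

    def ppf(self, q):
        q = np.asarray(q, dtype=float)
        idx = np.searchsorted(self.ck, q, side="left")  # smallest k with ck[k] >= q
        return self.xk[np.clip(idx, 0, len(self.xk) - 1)]

    def rvs(self, size, seed=None):
        rng = np.random.default_rng(seed)
        return rng.choice(self.xk, size=size, p=self.pk)


deg = FloatDiscrete([2.5], [1.0])                # degenerate at a non-integer point
mix = FloatDiscrete([0.5, -1.25], [0.75, 0.25])  # two float support points
```

With a single support point this gives a degenerate distribution at any float, which is the debugging case discussed in this thread.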
>
> this looks good
>
>>>> import numpy as np
>>>> from scipy import stats
>>>> stats.rv_discrete(values=([0],[1]), name='degenerate')
> <scipy.stats.distributions.rv_discrete object at 0x013BA5B0>
>>>> deg=stats.rv_discrete(values=([0],[1]), name='degenerate')
>>>> deg.rvs(size=10)
> array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
>>>> deg.pmf(np.arange(-5,6))
> array([ 0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.])
>>>> deg.cdf(np.arange(-5,6))
> array([ 0.,  0.,  0.,  0.,  0.,  1.,  1.,  1.,  1.,  1.,  1.])
>>>> deg.sf(np.arange(-5,6))
> array([ 1.,  1.,  1.,  1.,  1.,  0.,  0.,  0.,  0.,  0.,  0.])
>>>> deg.ppf(np.linspace(0,1,11))
> array([-1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])
>>>> deg.stats()
> (array(0.0), array(0.0))
>>>> deg.stats(moments='mvsk')
> (array(0.0), array(0.0), array(-1.#IND), array(-1.#IND))
>
>
> A degenerate Bernoulli has a NaN problem in pmf:
>
>>>> stats.bernoulli.rvs(0,size=10)
> array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
>>>> stats.bernoulli.pmf(np.arange(-5,6),0.)
> array([  0.,   0.,   0.,   0.,   0.,  NaN,   0.,   0.,   0.,   0.,   0.])
>>>> stats.bernoulli.cdf(np.arange(-5,6),0.)
> array([ 0.,  0.,  0.,  0.,  0.,  1.,  1.,  1.,  1.,  1.,  1.])
>>>> stats.bernoulli.pmf(np.arange(-5,6),1.)
> array([  0.,   0.,   0.,   0.,   0.,   0.,  NaN,   0.,   0.,   0.,   0.])
>>>> stats.bernoulli.ppf(np.linspace(0,1,11),0.)
> array([-1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.])
>>>> stats.bernoulli.ppf(np.linspace(0,1,11),1.)
> array([-1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])
>>>> stats.bernoulli.stats(0., moments='mvsk')
> (array(0.0), array(0.0), array(1.#INF), array(1.#INF))
>
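One way to sidestep the NaN at p = 0 and p = 1 is to evaluate the Bernoulli pmf directly by cases instead of through a generic formula. The sketch below is only an illustration; bernoulli_pmf is a hypothetical helper, not the scipy.stats implementation, and the thread does not say where the NaN comes from.

```python
import numpy as np


def bernoulli_pmf(k, p):
    """Hypothetical helper: Bernoulli pmf that stays finite at p = 0 and p = 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    k = np.asarray(k)
    # P(X = 1) = p, P(X = 0) = 1 - p, and zero everywhere else
    return np.where(k == 1, p, np.where(k == 0, 1.0 - p, 0.0))


k = np.arange(-5, 6)
pmf_at_zero = bernoulli_pmf(k, 0.0)  # all mass on k = 0
pmf_at_one = bernoulli_pmf(k, 1.0)   # all mass on k = 1
```

For p = 1-1e-16 the value at k = 0 is simply 1.0 - p in floating point, which need not be exactly 1e-16.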
>
> and an almost degenerate Bernoulli:
>
>>>> stats.bernoulli.pmf(np.arange(-5,6),1e-16)
> array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
>         0.00000000e+00,   0.00000000e+00,   1.00000000e+00,
>         1.00000000e-16,   0.00000000e+00,   0.00000000e+00,
>         0.00000000e+00,   0.00000000e+00])
>>>> stats.bernoulli.pmf(np.arange(-5,6),1-1e-16)
> array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
>         0.00000000e+00,   0.00000000e+00,   1.11022302e-16,
>         1.00000000e+00,   0.00000000e+00,   0.00000000e+00,
>         0.00000000e+00,   0.00000000e+00])
>>>> stats.bernoulli.ppf(np.linspace(0,1,11),1-1e-16)
> array([-1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])


For the record (and for future searches):

An almost degenerate normal also seems to work; compare
http://en.wikipedia.org/wiki/Dirac_delta_function

>>> stats.norm.rvs(loc=2.5, scale=1e-10, size=10)
array([ 2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5])
>>> stats.norm.cdf(np.linspace(2.1,2.9,11),loc=2.5, scale=1e-10)
array([ 0. ,  0. ,  0. ,  0. ,  0. ,  0.5,  1. ,  1. ,  1. ,  1. ,  1. ])
>>> stats.norm.pdf(np.linspace(2.1,2.9,11),loc=2.5, scale=1e-10)
array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
         0.00000000e+00,   0.00000000e+00,   3.98942280e+09,
         0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
         0.00000000e+00,   0.00000000e+00])
>>> stats.norm.cdf(np.linspace(2.1,2.9,11),loc=2.5, scale=1e-16)
array([ 0. ,  0. ,  0. ,  0. ,  0. ,  0.5,  1. ,  1. ,  1. ,  1. ,  1. ])
>>> stats.norm.pdf(np.linspace(2.1,2.9,11),loc=2.5, scale=1e-16)
array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
         0.00000000e+00,   0.00000000e+00,   3.98942280e+15,
         0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
         0.00000000e+00,   0.00000000e+00])
>>> stats.norm.ppf(np.linspace(0,1,11),loc=2.5, scale=1e-16)
array([-Inf,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  Inf])
>>>
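
The pdf values above follow from the height of the normal density at its mean, 1/(scale*sqrt(2*pi)), so the peak grows like 1/scale while the cdf approaches a step at loc. A small sketch that checks this; near_degenerate_summary is a hypothetical helper name, and delta=0.1 is an arbitrary offset chosen for illustration.

```python
import numpy as np
from scipy import stats


def near_degenerate_summary(loc, scale, delta=0.1):
    """Hypothetical helper: summarize a normal with a very small scale."""
    return {
        # density at the mean, as computed by scipy.stats.norm
        "peak_pdf": float(stats.norm.pdf(loc, loc=loc, scale=scale)),
        # closed-form height of the normal density at its mean
        "expected_peak": 1.0 / (scale * np.sqrt(2.0 * np.pi)),
        # cdf just below, at, and just above the location
        "cdf_below": float(stats.norm.cdf(loc - delta, loc=loc, scale=scale)),
        "cdf_at": float(stats.norm.cdf(loc, loc=loc, scale=scale)),
        "cdf_above": float(stats.norm.cdf(loc + delta, loc=loc, scale=scale)),
    }


summaries = {s: near_degenerate_summary(2.5, s) for s in (1e-10, 1e-16)}
```

Unlike the rv_discrete version, the cdf is 0.5 rather than 1 exactly at loc, which may matter when this is used as a stand-in for a point mass.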

Josef

>
> Josef
>
>>
>> bye
>>
>> Nicky