[SciPy-Dev] scipy.stats: algorithm for ticket 1493

Mon May 14 16:15:37 EDT 2012

On Mon, May 14, 2012 at 3:51 PM,  <josef.pktd at gmail.com> wrote:
> On Mon, May 14, 2012 at 2:45 PM, nicky van foreest <vanforeest at gmail.com> wrote:
>>>> Nice example. The answer is negative, while it should be positive, but
>>>> the answer is within numerical accuracy I would say.
>>>
>>> oops, didn't we have a case with negative sign already ?
>>> maybe a check self.a <= p <= self.b  ?
>>
>> I included this. I also think that a check on whether left and right
>> stay within  self.a and self.b should be included, perhaps just for
>> safety reasons.
>>
>>>
>>>>
>>>>> I don't see anything yet to criticize in your latest version :(
>>>>
>>>> Ok. I just checked the tests in scipy/stats/tests.
>>>
>>> If you are curious, you could temporarily go closer to q=0 and q=1 in
>>> the tests for ppf, and see whether it breaks for any distribution.
>>
>> Good idea. Just to see what would happen I changed the following code
>> in test_continuous_basic.py:
>>
>> @_silence_fp_errors
>> def check_cdf_ppf(distfn, arg, msg):
>>     values = [-1.e-5, 0., 0.001, 0.5, 0.999, 1.]
>>     npt.assert_almost_equal(distfn.cdf(distfn.ppf(values, *arg), *arg),
>>                             values, decimal=DECIMAL,
>>                             err_msg=msg + ' - cdf-ppf roundtrip')
>
> roundtrip: looks like ppf should be ok, but cdf is not
>
>>>> stats.norm.ppf(-1e-5)
> nan
>>>> stats.norm.cdf(np.nan)
> 0.0
>>>> stats.norm.cdf(stats.norm.ppf(-1e-5))
> 0.0
>
> I'm using scipy 0.9, but as far as I know this has not changed.
>
> I'm trying to track down when this got changed.
> (github doesn't show changes in a file that has too many changes, need
> to dig out git)

It would be better to run the same version as the code I'm looking at.
It's difficult to find a bug or understand the behavior if it's no
longer there.

Switching to scipy 0.10:

>>> stats.norm.cdf(np.nan)
nan
>>> scipy.__version__
'0.10.0b2'

nan propagation is not available in 0.9.0

https://github.com/scipy/scipy/commit/96e39ecc6a2b671ed7f99a9c0375adc9238c6056#L0L1343
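
This would also explain why the modified test passes on 0.9: ppf(-1e-5)
is nan, cdf(nan) comes back as 0.0, and 0.0 is then compared against
-1e-5. A rough sketch of the arithmetic follows. It assumes DECIMAL is 5
in test_continuous_basic.py (I have not re-checked the value) and uses
the criterion documented for npt.assert_almost_equal,
abs(desired - actual) < 1.5 * 10**(-decimal). The helper name
within_decimal is made up for illustration, and the 0.0 roundtrip for a
q slightly above 1 is also an assumption.

```python
def within_decimal(actual, desired, decimal):
    """Mirror the documented assert_almost_equal criterion (illustrative helper)."""
    return abs(desired - actual) < 1.5 * 10 ** (-decimal)


DECIMAL = 5  # assumption: the precision used in test_continuous_basic.py

# scipy 0.9: cdf(ppf(-1e-5)) is 0.0 (shown above); assumed the same for q > 1
roundtrip_old = 0.0

passes_negative = within_decimal(roundtrip_old, -1e-5, DECIMAL)       # |diff| = 1e-5
passes_above_one = within_decimal(roundtrip_old, 1.0000001, DECIMAL)  # |diff| ~ 1

# With nan propagation the roundtrip is nan, and nan never satisfies the inequality
passes_with_nan = within_decimal(float("nan"), -1e-5, DECIMAL)
```

Under these assumptions -1e-5 slips through the tolerance while
1.0000001 does not, which matches the pass and the fail reported in the
quoted text.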

Josef

>
>>
>>
>> Thus, I changed the values into an array. It should fail on the first
>> value, as it is negative, but I get a pass. Specifically, I ran:
>>
>> nicky at chuck:~/prog/scipy/scipy/stats/tests$ python test_continuous_basic.py
>> ..............................................................................................................................
>> ----------------------------------------------------------------------
>> Ran 126 tests in 93.990s
>>
>> OK
>>
>>>
>>
>> Weird result. If I add a q  = 1.0000001 I get a fail on the fourth
>> test, as expected.
>>
>>>> - repair for the cases q =  0 and q = 1 by means of an explicit test.
>>>
>>> Isn't the generic part of ppf taking care of this? If not, then it should, I think.
>>
>> Actually, from the code in lines:
>>
>> https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L1529
>>
>> I am inclined to believe you. However, in view of the above test ...
>> Might it be that the conditions on L1529 have been added quite
>> recently, and did not yet make it to my machine? I'll check this right
>> now... As a matter of fact, my distributions.py contains the same
>> check, i.e., cond1 = (q > 0) & (q < 1). Hmmm.
>>
>> Now I admit that I do not understand in all nitty-gritty detail the
>> entire implementation of ppf(), but I suspect that this is a bug.
>>
>>>
>>> ppf(0) = self.a
>>> ppf(1) = self.b
>>
>> Good idea.
>
> this already looks correct in the generic ppf code
>
>>>> stats.beta.ppf(0, 0.5)
> 0.0
>>>> stats.beta.a
> 0.0
>
> Josef
>>
>> I'll implement the code in my branch, and do a pull request.
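
To summarize the boundary handling discussed in this thread (nan outside
[0, 1], ppf(0) = self.a, ppf(1) = self.b, and a bracket that stays
inside the support), here is a minimal scalar sketch in plain Python. It
is not the code in distributions.py, which is vectorized and works with
condition masks such as cond1. The names sketch_ppf and norm_cdf, and the
lo/hi search limits, are made up for illustration.

```python
import math


def norm_cdf(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sketch_ppf(q, cdf, a, b, lo=-50.0, hi=50.0, tol=1e-12):
    """Illustrative ppf, not scipy's implementation.

    Returns nan for q outside [0, 1], the support endpoints a and b at
    q = 0 and q = 1, and otherwise bisects on cdf. lo and hi are
    hypothetical search limits used when a or b is infinite.
    """
    if math.isnan(q) or q < 0.0 or q > 1.0:
        return float("nan")
    if q == 0.0:
        return a
    if q == 1.0:
        return b
    # Keep the bracket inside the support [a, b]
    left = max(a, lo)
    right = min(b, hi)
    while right - left > tol:
        mid = 0.5 * (left + right)
        if cdf(mid) < q:
            left = mid
        else:
            right = mid
    return 0.5 * (left + right)


inf = float("inf")
x_low = sketch_ppf(-1e-5, norm_cdf, -inf, inf)   # q below 0
x_zero = sketch_ppf(0.0, norm_cdf, -inf, inf)    # left endpoint of the support
x_one = sketch_ppf(1.0, norm_cdf, -inf, inf)     # right endpoint of the support
x_tail = sketch_ppf(0.999, norm_cdf, -inf, inf)  # interior value, found by bisection
```

Handling q = 0 and q = 1 before the root search means the bisection only
ever sees 0 < q < 1, so it never has to locate an infinite endpoint.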