[SciPy-User] MLE with stats.lognorm

Mon Oct 10 19:35:54 EDT 2011

On Mon, Oct 10, 2011 at 3:22 PM,  <josef.pktd at gmail.com> wrote:
> On Mon, Oct 10, 2011 at 10:26 AM, Christian K. <ckkart at hoc.net> wrote:
>>> >> for example with starting value for loc
>>> >>>>> print stats.lognorm.fit(x, loc=0)
>>> >> (0.23800805074491538, 0.034900026034516723, 196.31113801786194)
>>> >
>>> > I see. Is there any workaround/patch to force loc=0.0? What is the
>>> > meaning of loc anyway?
>>>
>>> loc is the starting value for fmin, I don't remember how to specify
>>> starting values for shape parameters, I never used it.
>>>
>>> As in the ticket you could monkey patch the _fitstart function
>>>
>>> >>> stats.cauchy._fitstart = lambda x:(0,1)
>>> >>> stats.cauchy.fit(x)
>>>
>>> or what I do to experiment with starting values is
>>>
>>> stats.distributions.lognorm_gen._fitstart = fitstart_lognormal
>>
>> Ok, but this is not different from calling fit like
>> stats.lognorm.fit(samples, loc=0.0)
>>
>> I would really need to force loc=0.0
>>
>> stats.lognorm.fit(samples, loc=0.0, floc=0.0)
>>
>> does not work either.
>
> OK, I misunderstood: you want to fix the location parameter at zero.
>
> This looks like a different bug.
>
> floc=0 doesn't seem to work: I don't get any results that look close
> to the true values.
> With a sample size of 2000, the MLE should be pretty close to the true
> parameters:

this is now http://projects.scipy.org/scipy/ticket/1536

I ran a few more distributions as examples, and my conclusion is: at
this stage, don't trust any results obtained with floc set.

As far as I know, nobody has ever checked the fixed-parameter cases in
the distributions' fit method. Patches welcome.
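
For the floc=0 case there is an independent reference to check fit
against: with loc fixed at 0, the log-normal MLE has a closed form, where
s is the standard deviation of log(x) (with ddof=0) and scale is
exp(mean(log(x))). A minimal sketch, assuming a SciPy version where the
floc handling works as intended; the seed and the variable names are
only for illustration and are not taken from the ticket:

```python
import numpy as np
from scipy import stats

# Sample with the same true parameters as in the thread: s=0.25, loc=0, scale=20
x = stats.lognorm.rvs(0.25, loc=0.0, scale=20.0, size=2000, random_state=0)

# Closed-form MLE when loc is known to be 0
logx = np.log(x)
s_closed = logx.std()               # ddof=0 is the MLE of sigma
scale_closed = np.exp(logx.mean())  # scale = exp(mu)

# Numerical fit with loc held fixed at 0
s_fit, loc_fit, scale_fit = stats.lognorm.fit(x, floc=0)

print("closed form:", s_closed, 0.0, scale_closed)
print("fit(floc=0):", s_fit, loc_fit, scale_fit)
```

If the two printed lines disagree noticeably, the fixed-parameter code
path is the likely suspect rather than the data.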

Josef

>
>
> import numpy as np
>
> from scipy import stats
>
> np.set_printoptions(precision=4)
> print('true')
> print(0.25, 0., 20.0)
> print('estimated, floc=0, loc=0')
> for i in range(10):
>     # true parameters: shape s=0.25, loc=0, scale=20
>     x = stats.lognorm.rvs(0.25, 0., 20.0, size=2000)
>     # floc=0 fixes loc; loc=0 only gives a starting value
>     print(np.array(stats.lognorm.fit(x, floc=0)),
>           np.array(stats.lognorm.fit(x, loc=0)))
>
> true
> 0.25 0.0 20.0
> estimated, floc=0, loc=0
> [ 2.1271  0.      2.3999] [  0.2623   1.0211  18.7911]
> [ 2.1393  0.      2.3952] [  0.2523   0.0294  20.0117]
> [ 2.1356  0.      2.3978] [  0.2477   0.03    19.9703]
> [ 2.1378  0.      2.3874] [  0.2496   0.0301  19.9231]
> [ 2.1463  0.      2.3641] [  0.2474   0.0292  19.9051]
> [ 2.1408  0.      2.3898] [  0.2459   0.0303  20.0118]
> [ 2.1252  0.      2.4326] [  0.251    0.029   20.0412]
> [ 2.1296  0.      2.3943] [  0.2476   0.0296  19.8208]
> [ 2.1344  0.      2.401 ] [  0.2472   0.0299  19.9744]
> [ 2.1383  0.      2.4133] [  0.247    0.0301  20.1544]
>
> floc=0 is supposed to fix the location at 0; loc=0 only provides a
> starting value for loc, which is still estimated.
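
The distinction between the two keywords can be seen from the returned
loc alone. A small sketch, again assuming a SciPy version in which floc
behaves as documented (the seed is arbitrary):

```python
from scipy import stats

x = stats.lognorm.rvs(0.25, loc=0.0, scale=20.0, size=2000, random_state=1)

# loc=0 is only the optimizer's starting value; loc remains a free parameter
s_free, loc_free, scale_free = stats.lognorm.fit(x, loc=0)

# floc=0 holds loc at exactly 0, so only s and scale are estimated
s_fixed, loc_fixed, scale_fixed = stats.lognorm.fit(x, floc=0)

print("loc with loc=0: ", loc_free)
print("loc with floc=0:", loc_fixed)
```
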
>
>>
>> Btw., I think the extradoc is quite misleading:
>
> I think this might be just the non-standard parameterization of the
> log-normal distribution because we use generic loc and scale handling.
> The parameterization has been discussed in the mailing list and for
> example in http://projects.scipy.org/scipy/ticket/1502
>
> Clearer documentation for this, or a reparameterized distribution,
> would be helpful for lognorm.
>
> Josef
>
>>
>> """
>> lognorm.pdf(x,s) = 1/(s*x*sqrt(2*pi)) * exp(-1/2*(log(x)/s)**2)
>> for x > 0, s > 0.
>>
>> If log x is normally distributed with mean mu and variance sigma**2,
>> then x is log-normally distributed with shape parameter sigma and scale
>> parameter exp(mu).
>> """
>>
>> sigma seems to equal s in the function definition but mu does not appear at
>> all. It seems to enter via _pdf()/scale when looking at distributions.py,
>> where scale = exp(mu)?
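
One way to check that reading numerically: write out the textbook
log-normal density with explicit mu and sigma and compare it with
lognorm.pdf called with s=sigma and scale=exp(mu). This is only a
sketch; the values of mu and sigma and the grid are arbitrary choices:

```python
import numpy as np
from scipy import stats

mu, sigma = 3.0, 0.25
x = np.linspace(5.0, 60.0, 50)

# Textbook density: log(x) ~ Normal(mu, sigma**2)
pdf_textbook = (np.exp(-0.5 * ((np.log(x) - mu) / sigma) ** 2)
                / (x * sigma * np.sqrt(2 * np.pi)))

# SciPy parameterization: shape s = sigma, loc = 0, scale = exp(mu)
pdf_scipy = stats.lognorm.pdf(x, sigma, loc=0, scale=np.exp(mu))

print(np.allclose(pdf_textbook, pdf_scipy))
```

The generic scale handling evaluates the docstring formula at x/scale
and divides by scale, which is how mu enters without appearing in
the formula itself.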
>>
>> Christian