specify lognormal distribution with mu and sigma using scipy.stats
Hello list,

I am having trouble creating a lognormal distribution with known mean mu and standard deviation sigma using scipy.stats. According to the docs, the programmed function is:

lognorm.pdf(x,s) = 1/(s*x*sqrt(2*pi)) * exp(-1/2*(log(x)/s)**2)

So s is the standard deviation. But how do I specify the mean? I found some information that when you specify loc and scale, you replace x by (x-loc)/scale. But in the lognormal distribution, you want to replace log(x) by log(x)-loc, where loc is mu. How do I do that?

In addition, would it be a good idea to create some convenience functions that allow you to simply create lognormal (and maybe normal) distributions by specifying the more common mu and sigma? That would surely make things more user-friendly.

Thanks,
Mark
On Wed, Oct 14, 2009 at 4:22 AM, Mark Bakker <markbak@gmail.com> wrote:
> So s is the standard deviation. But how do I specify the mean?
I don't think the loc of lognorm makes much sense in most applications, since it just shifts the support: the lower boundary is 0 + loc. The loc (mean) of the underlying normal distribution enters through the scale, as scale = exp(mu).

See also http://en.wikipedia.org/wiki/Log-normal_distribution#Mean_and_standard_devia...
>>> print stats.lognorm.extradoc

Lognormal distribution

lognorm.pdf(x,s) = 1/(s*x*sqrt(2*pi)) * exp(-1/2*(log(x)/s)**2)
for x > 0, s > 0.

If log x is normally distributed with mean mu and variance sigma**2,
then x is log-normally distributed with shape parameter sigma and scale
parameter exp(mu).

Roundtrip with mean mu of the underlying normal distribution (shape s=1, i.e. sigma=1, so the lognormal mean is exp(mu + 0.5)):
>>> mu = np.arange(5)
>>> np.log(stats.lognorm.stats(1, loc=0, scale=np.exp(mu))[0]) - 0.5
array([ 0., 1., 2., 3., 4.])
Corresponding means of the lognormal distribution:

>>> stats.lognorm.stats(1, loc=0, scale=np.exp(mu))[0]
array([ 1.64872127, 4.48168907, 12.18249396, 33.11545196, 90.0171313 ])
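The mapping described in the docstring can be checked directly. The following is only a sketch with illustrative values of mu and sigma (they are not from the thread): it builds a frozen lognorm with shape sigma and scale exp(mu), then compares its cdf with the cdf of the underlying normal evaluated at log(x).

```python
import numpy as np
from scipy import stats

# Illustrative parameters of the underlying normal: log(X) ~ N(mu, sigma**2)
mu, sigma = 1.5, 0.4

# scipy's lognorm takes sigma as the shape parameter and exp(mu) as the scale
dist = stats.lognorm(sigma, loc=0, scale=np.exp(mu))

x = np.array([1.0, 3.0, 5.0, 10.0])

# P(X <= x) should equal P(log X <= log x) under the underlying normal
cdf_lognorm = dist.cdf(x)
cdf_normal = stats.norm(loc=mu, scale=sigma).cdf(np.log(x))

median = dist.median()  # expected to equal exp(mu)
mean = dist.mean()      # expected to equal exp(mu + 0.5 * sigma**2)
```

Note that exp(mu) is the median of the lognormal variable, not its mean, which is where the extra 0.5*sigma**2 term comes from.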
Shifting the support:

>>> stats.lognorm.a
0.0
>>> stats.lognorm.ppf([0, 0.5, 1], 1, loc=3, scale=1)
array([ 3., 4., Inf])
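To make the effect of loc concrete, here is a small sketch (the value loc = 3 is taken from the example above, everything else is illustrative): a shifted and an unshifted lognorm differ by exactly loc in every quantile and in the mean, so loc has nothing to do with mu.

```python
import numpy as np
from scipy import stats

loc = 3.0
shifted = stats.lognorm(1, loc=loc, scale=1)
unshifted = stats.lognorm(1, loc=0, scale=1)

# Lower end of the support is 0 + loc
lower = shifted.ppf(0)

# Every quantile moves by exactly loc; the shape of the distribution is unchanged
q = np.array([0.1, 0.5, 0.9])
shift_gap = shifted.ppf(q) - unshifted.ppf(q)

# The mean moves by loc as well
mean_gap = shifted.mean() - unshifted.mean()
```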
The only case that I know for lognormal is in regression, so I'm not sure what you mean by the convenience functions. (The normal distribution is defined by loc=mean, scale=standard deviation.)

Assume the regression equation is

y = x*beta*exp(u), with u distributed normal(0, sigma^2)

This implies

ln y = ln(x*beta) + u

which is just a standard linear regression equation that can be estimated by OLS or MLE. exp(u) in this case is lognormally distributed.

Josef
On Wed, Oct 14, 2009 at 9:20 AM, <josef.pktd@gmail.com> wrote:
> assume the regression equation is y = x*beta*exp(u), u distributed normal(0, sigma^2)
> this implies ln y = ln(x*beta) + u
> which is just a standard linear regression equation which can be estimated by ols or mle
I think I misremembered this part: I just realized that the regression equation would be nonlinear in the parameters.

Josef
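A minimal simulation of the multiplicative-error model, restricted to an assumed special case: a single positive regressor x and a positive scalar beta, where the log does split as ln y = ln(beta) + ln(x) + u. The values of beta and sigma, the sample size and the seed are illustrative, and this is a sketch rather than a general estimator (with several regressors, ln(x*beta) stays nonlinear in beta).

```python
import numpy as np

rng = np.random.default_rng(0)
beta, sigma = 2.5, 0.3  # illustrative values, not from the thread
n = 20000

x = rng.uniform(1.0, 10.0, size=n)
u = rng.normal(0.0, sigma, size=n)
y = beta * x * np.exp(u)  # multiplicative lognormal error

# Scalar case only: ln y - ln x = ln(beta) + u, so ln(beta) is the mean log-ratio
log_ratio = np.log(y) - np.log(x)
beta_hat = np.exp(log_ratio.mean())
sigma_hat = log_ratio.std(ddof=1)

# exp(u) is lognormal with mean exp(0.5 * sigma**2), which is larger than 1
mean_exp_u = np.exp(u).mean()
```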
Hello,

I'm also having difficulties with lognorm. If mu is the mean and s**2 is the variance then...

>>> from scipy.stats import lognorm
>>> from math import exp
>>> mu = 10
>>> s = 1
>>> d = lognorm(s, scale=exp(mu))
>>> d.stats('m')
array(36315.502674246643)
Shouldn't that be 10?
Armando Serrano Lombillo wrote:
> shouldn't that be 10?
In terms of mu and sigma, the mean of the lognormal distribution is exp(mu + 0.5*sigma**2). In your example:

In [16]: exp(10.5)
Out[16]: 36315.502674246636

Warren
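Warren's formula can be checked against scipy together with the matching variance formula, (exp(sigma**2) - 1) * exp(2*mu + sigma**2), which is the standard lognormal result and is not stated in the thread. The sketch below reuses mu = 10 and sigma = 1 from the example.

```python
from math import exp, isclose
from scipy.stats import lognorm

mu, sigma = 10.0, 1.0  # parameters of the underlying normal distribution
d = lognorm(sigma, scale=exp(mu))

# Mean and variance of the lognormal variable as computed by scipy
mean, var = d.stats(moments="mv")

# Closed-form expressions in terms of mu and sigma
expected_mean = exp(mu + 0.5 * sigma**2)
expected_var = (exp(sigma**2) - 1.0) * exp(2.0 * mu + sigma**2)
```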
OK, I had misunderstood: I thought mu and sigma were the mean and standard deviation of the lognormally distributed variable itself. So, this is what I should have written:

>>> from math import exp, log, sqrt
>>> from scipy.stats import lognorm
>>> mean = 10.0
>>> variance = 1.0
>>> mean_n = log(mean) - 0.5*log(1 + variance/mean**2)
>>> variance_n = log(variance/mean**2 + 1)
>>> d = lognorm(sqrt(variance_n), scale=exp(mean_n))
>>> d.stats()
(array(10.000000000000002), array(1.0000000000000013))
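The conversion above could be wrapped in the kind of convenience function Mark asked about. The helper name lognorm_from_mean_std is hypothetical (it is not part of scipy), and the sketch assumes loc = 0 and that the mean and standard deviation given are those of the lognormal variable itself.

```python
from math import exp, log, sqrt
from scipy.stats import lognorm

def lognorm_from_mean_std(mean, std):
    """Hypothetical helper (not a scipy function): frozen lognorm whose
    mean and standard deviation are those of the lognormal variable itself."""
    if mean <= 0 or std <= 0:
        raise ValueError("mean and std must be positive")
    sigma2 = log(1.0 + (std / mean) ** 2)  # variance of log(X)
    mu = log(mean) - 0.5 * sigma2          # mean of log(X)
    return lognorm(sqrt(sigma2), scale=exp(mu))

# Same numbers as in the example above: mean 10, variance 1
d = lognorm_from_mean_std(10.0, 1.0)
m, v = d.stats(moments="mv")
```

Passing the standard deviation rather than the variance is a design choice made here so that both arguments are in the same units as the data.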
Thanks,
Armando.

On Wed, Jul 21, 2010 at 5:48 PM, Warren Weckesser <warren.weckesser@enthought.com> wrote:
> In terms of mu and sigma, the mean of the lognormal distribution is exp(mu + 0.5*sigma**2).
participants (4)
- Armando Serrano Lombillo
- josef.pktd@gmail.com
- Mark Bakker
- Warren Weckesser