generic expectation operator
Hi, For numerous purposes I need to compute the expectation of some function with respect to some probability distribution. i.e. E f(X) = \int f(x) dG(x), where f is the function and G the distribution. When G is continuous, this is simple, just use scipy.integrate.quad. However, if G is discrete, things become more complicated since quad often misses the spikes in the density. To deal with this problem, in part, I came up with the following code, but it is not completely ok yet, see below. #!/usr/bin/env python import scipy.stats as stats import scipy.integrate from math import sqrt def E(f, X): if hasattr(X,'dist'): if isinstance(X.dist, stats.rv_discrete): # hint from Ralph g = X.pmf else: g = X.pdf return scipy.integrate.quad(lambda x: x*g(x), X.dist.a, X.dist.b) else: return sum(f(x)*p for x,p in zip(X.xk, X.pk)) # now the following works: X = stats.norm(3,sqrt(3)) print E(sqrt, X) # this is ok too grid = [50, 100, 150, 200, 250, 300] probs=[2,3,2,1.,0.5,0.5] tot = sum(probs) probs = [p/tot for p in probs] X = stats.rv_discrete(values=(grid,probs), name='sizeDist') print E(sqrt, X) # but this is not ok X = stats.poisson(3) print E(sqrt, X) ------------------------------------------ This is the output: 11.0383463182 (7.771634331185016e-19, 1.916327339868129e-17) (2.9999999999999996, 1.1770368678494054e-08) So indeed, quad misses the spikes in the poisson distribution. Is there any nice way to resolve this? Moreover, is there a better way to build the expectation operator? Perhaps, if all works, it would be a useful add-on to distributions.py, or else I can include it in the distribution tutorial, if other people also think this is useful. thanks Nicky
On Wed, Aug 22, 2012 at 4:49 PM, nicky van foreest <vanforeest@gmail.com> wrote:
Hi,
For numerous purposes I need to compute the expectation of some function with respect to some probability distribution. i.e. E f(X) = \int f(x) dG(x), where f is the function and G the distribution. When G is continuous, this is simple, just use scipy.integrate.quad. However, if G is discrete, things become more complicated since quad often misses the spikes in the density. To deal with this problem, in part, I came up with the following code, but it is not completely ok yet, see below.
#!/usr/bin/env python
import scipy.stats as stats import scipy.integrate from math import sqrt
def E(f, X): if hasattr(X,'dist'): if isinstance(X.dist, stats.rv_discrete): # hint from Ralph g = X.pmf else: g = X.pdf return scipy.integrate.quad(lambda x: x*g(x), X.dist.a, X.dist.b) else: return sum(f(x)*p for x,p in zip(X.xk, X.pk))
# now the following works: X = stats.norm(3,sqrt(3)) print E(sqrt, X)
# this is ok too grid = [50, 100, 150, 200, 250, 300] probs=[2,3,2,1.,0.5,0.5] tot = sum(probs) probs = [p/tot for p in probs] X = stats.rv_discrete(values=(grid,probs), name='sizeDist') print E(sqrt, X)
# but this is not ok
X = stats.poisson(3) print E(sqrt, X)
------------------------------------------
This is the output:
11.0383463182 (7.771634331185016e-19, 1.916327339868129e-17) (2.9999999999999996, 1.1770368678494054e-08)
So indeed, quad misses the spikes in the poisson distribution.
Is there any nice way to resolve this? Moreover, is there a better way to build the expectation operator? Perhaps, if all works, it would be a useful add-on to distributions.py, or else I can include it in the distribution tutorial, if other people also think this is useful.
quad works only for continuous functions did you look at
stats.poisson.expect <bound method poisson_gen.expect of <scipy.stats.distributions.poisson_gen object at 0x0320FC90>> stats.norm.expect <bound method norm_gen.expect of <scipy.stats.distributions.norm_gen object at 0x031EB890>>
improvements welcome, especially making it more robust Josef
thanks
Nicky _______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
On Wed, Aug 22, 2012 at 5:03 PM, <josef.pktd@gmail.com> wrote:
On Wed, Aug 22, 2012 at 4:49 PM, nicky van foreest <vanforeest@gmail.com> wrote:
Hi,
For numerous purposes I need to compute the expectation of some function with respect to some probability distribution. i.e. E f(X) = \int f(x) dG(x), where f is the function and G the distribution. When G is continuous, this is simple, just use scipy.integrate.quad. However, if G is discrete, things become more complicated since quad often misses the spikes in the density. To deal with this problem, in part, I came up with the following code, but it is not completely ok yet, see below.
#!/usr/bin/env python
import scipy.stats as stats import scipy.integrate from math import sqrt
def E(f, X): if hasattr(X,'dist'): if isinstance(X.dist, stats.rv_discrete): # hint from Ralph g = X.pmf else: g = X.pdf return scipy.integrate.quad(lambda x: x*g(x), X.dist.a, X.dist.b) else: return sum(f(x)*p for x,p in zip(X.xk, X.pk))
# now the following works: X = stats.norm(3,sqrt(3)) print E(sqrt, X)
I don't know why this works, sqrt of a normal distributed random variable has lots of nans (or complex numbers)
f = lambda x: np.sqrt(np.maximum(x,0)) stats.norm.expect(f, loc=3, scale=np.sqrt(3)) 1.6394037171420484
f = lambda x: np.sqrt(np.abs(x)) stats.norm.expect(f, loc=3, scale=np.sqrt(3)) 1.6708150951924068
# this is ok too grid = [50, 100, 150, 200, 250, 300] probs=[2,3,2,1.,0.5,0.5] tot = sum(probs) probs = [p/tot for p in probs] X = stats.rv_discrete(values=(grid,probs), name='sizeDist') print E(sqrt, X)
# but this is not ok
X = stats.poisson(3) print E(sqrt, X)
stats.poisson.expect(np.sqrt, args=(3,)) 1.6309535375094115
Josef
------------------------------------------
This is the output:
11.0383463182 (7.771634331185016e-19, 1.916327339868129e-17) (2.9999999999999996, 1.1770368678494054e-08)
So indeed, quad misses the spikes in the poisson distribution.
Is there any nice way to resolve this? Moreover, is there a better way to build the expectation operator? Perhaps, if all works, it would be a useful add-on to distributions.py, or else I can include it in the distribution tutorial, if other people also think this is useful.
quad works only for continuous functions
did you look at
stats.poisson.expect <bound method poisson_gen.expect of <scipy.stats.distributions.poisson_gen object at 0x0320FC90>> stats.norm.expect <bound method norm_gen.expect of <scipy.stats.distributions.norm_gen object at 0x031EB890>>
improvements welcome, especially making it more robust
Josef
thanks
Nicky _______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
Hi Josef, Thanks for your answers.
return scipy.integrate.quad(lambda x: x*g(x), X.dist.a, X.dist.b)
X = stats.norm(3,sqrt(3)) print E(sqrt, X)
I don't know why this works, sqrt of a normal distributed random variable has lots of nans (or complex numbers)
You are right, it shouldn't work. The point is that my code above is this: lambda x: x*g(x), while it should have been lambda x: f(x)*g(x). I'll give expect a try. Nicky
Hi, I noticed that rv_frozen does not have an expect attribute. Is there a design reason for this? It feels somewhat unnatural to me that frozen distributions cannot be used directly to compute expectations. bye Nicky On 23 August 2012 18:36, nicky van foreest <vanforeest@gmail.com> wrote:
Hi Josef,
Thanks for your answers.
return scipy.integrate.quad(lambda x: x*g(x), X.dist.a, X.dist.b)
X = stats.norm(3,sqrt(3)) print E(sqrt, X)
I don't know why this works, sqrt of a normal distributed random variable has lots of nans (or complex numbers)
You are right, it shouldn't work. The point is that my code above is this: lambda x: x*g(x), while it should have been lambda x: f(x)*g(x).
I'll give expect a try.
Nicky
On Thu, Aug 23, 2012 at 1:11 PM, nicky van foreest <vanforeest@gmail.com> wrote:
Hi,
I noticed that rv_frozen does not have an expect attribute. Is there a design reason for this? It feels somewhat unnatural to me that frozen distributions cannot be used directly to compute expectations.
oversight to add it, I guess. can be added and it can delegate like in the other methods. PR welcome. frozen_dist.dist.expect(..) (I almost never work with frozen distribution and didn't see this.) Josef
bye
Nicky
On 23 August 2012 18:36, nicky van foreest <vanforeest@gmail.com> wrote:
Hi Josef,
Thanks for your answers.
return scipy.integrate.quad(lambda x: x*g(x), X.dist.a, X.dist.b)
X = stats.norm(3,sqrt(3)) print E(sqrt, X)
I don't know why this works, sqrt of a normal distributed random variable has lots of nans (or complex numbers)
You are right, it shouldn't work. The point is that my code above is this: lambda x: x*g(x), while it should have been lambda x: f(x)*g(x).
I'll give expect a try.
Nicky
SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
On Thu, Aug 23, 2012 at 1:24 PM, <josef.pktd@gmail.com> wrote:
On Thu, Aug 23, 2012 at 1:11 PM, nicky van foreest <vanforeest@gmail.com> wrote:
Hi,
I noticed that rv_frozen does not have an expect attribute. Is there a design reason for this? It feels somewhat unnatural to me that frozen distributions cannot be used directly to compute expectations.
oversight to add it, I guess.
can be added and it can delegate like in the other methods. PR welcome. frozen_dist.dist.expect(..)
(I almost never work with frozen distribution and didn't see this.)
miss = set(dir(stats.norm)) - set(dir(stats.norm(loc=5))) [i for i in miss if not i[0]=='_'] ['moment_type', 'xb', 'xa', 'freeze', 'badvalue', 'expect', 'xtol', 'vecentropy', 'fit_loc_scale', 'expandarr', 'fit', 'vecfunc', 'est_loc_scale', 'logsf', 'logcdf', 'logpdf', 'generic_moment', 'numargs', 'name', 'a', 'nnlf', 'b', 'm', 'shapes', 'extradoc', 'veccdf'] import scipy scipy.__version__ '0.9.0'
'logsf', 'logcdf', 'logpdf', are also missing (in my version of scipy) Josef
Josef
bye
Nicky
On 23 August 2012 18:36, nicky van foreest <vanforeest@gmail.com> wrote:
Hi Josef,
Thanks for your answers.
return scipy.integrate.quad(lambda x: x*g(x), X.dist.a, X.dist.b)
X = stats.norm(3,sqrt(3)) print E(sqrt, X)
I don't know why this works, sqrt of a normal distributed random variable has lots of nans (or complex numbers)
You are right, it shouldn't work. The point is that my code above is this: lambda x: x*g(x), while it should have been lambda x: f(x)*g(x).
I'll give expect a try.
Nicky
SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
Hi Josef,
oversight to add it, I guess.
Ok.
can be added and it can delegate like in the other methods. PR welcome. frozen_dist.dist.expect(..)
oops. I saw this request coming. While it is trivial to implement (nearly a copy paste from the other examples in rv_frozen), I have to admit that working with git takes me a considerable amount of time, as I hardly use it.
(I almost never work with frozen distribution and didn't see this.)
Is there a specific reason for this? I find it very convenient. For the moment I solve the problem with the expectation with this function: def expect(X, f, lb, ub): if hasattr(X, 'dist'): return X.dist.expect(f, lb = lb, ub = ub) else: return X.expect(f, lb = lb, ub = ub) Like this I can pass frozen distributions and the ones that I make with rv_discrete. As your suggestion above is not yet part of standard scipy, I 'll use this function to show my students how convenient python is. Nicky
On Thu, Aug 23, 2012 at 2:09 PM, nicky van foreest <vanforeest@gmail.com> wrote:
Hi Josef,
oversight to add it, I guess.
Ok.
can be added and it can delegate like in the other methods. PR welcome. frozen_dist.dist.expect(..)
oops. I saw this request coming. While it is trivial to implement (nearly a copy paste from the other examples in rv_frozen), I have to admit that working with git takes me a considerable amount of time, as I hardly use it.
I'm using mostly git gui so I can keep the number of git commands, sub-commands and options that I have to remember or to look up to a small number. Then it's ok.
(I almost never work with frozen distribution and didn't see this.)
Is there a specific reason for this? I find it very convenient.
habit, unnecessary detour and useless for fitting.
For the moment I solve the problem with the expectation with this function:
def expect(X, f, lb, ub): if hasattr(X, 'dist'): return X.dist.expect(f, lb = lb, ub = ub) else: return X.expect(f, lb = lb, ub = ub)
Like this I can pass frozen distributions and the ones that I make with rv_discrete. As your suggestion above is not yet part of standard scipy, I 'll use this function to show my students how convenient python is.
If you run into problems with the discrete expect, then please report them or file a ticket (if not a PR). I have no idea how much usage it has seen, and it's based on a truncated sum, that has problems in some cases (edge cases or cases where tails are important). Josef
Nicky _______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
participants (2)
-
josef.pktd@gmail.com -
nicky van foreest