Request for usage examples on scipy.stats.rv_continuous and scipy.ndimage.grey_dilate(structure)
Hi, I am having some trouble figuring out how to use the following two aspects of scipy: 1) How can I create a random distribution given by some python formula, e.g. pdf(x) = x**2 in the range x = 1 to 2. The example given in the rv_continuous docstring doesn't work for me: In [1]: import matplotlib.pyplot as plt In [2]: numargs = generic.numargs AttributeError: type object 'numpy.generic' has no attribute 'numargs' Is it also possible to use rv_continuous for multidimensional distributions, say pdf(x,y)=x*y in the range x = 1 to 2 and y = 0 to 3? Or for pdfs that are just given numerically on a grid (say as the result of a gaussian_kde() ), not by a python function? It would be nice if someone could add a few examples that show rv_continuous usage to the docstring or tutorial. 2) How to use the 'structure' argument of scipy.ndimage.grey_dilate() correctly? There are some explanations at http://docs.scipy.org/doc/scipy/reference/tutorial/ndimage.html but I think they are inconsistent with the implementation: -------------------------------------------------------------- In [98]: a = zeros((7,7)); a[3,3]=1; a Out[98]: array([[ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 1., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.]]) In [100]: s = generate_binary_structure(2,2); s Out[100]: array([[ True, True, True], [ True, True, True], [ True, True, True]], dtype=bool) In [101]: grey_dilation(a,size=(3,3)) Out[101]: array([[ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.]]) In [102]: grey_dilation(a,size=(3,3),structure=s) Out[102]: array([[ 1., 1., 1., 1., 1., 1., 1.], [ 1., 1., 1., 1., 1., 1., 1.], [ 1., 1., 2., 2., 2., 1., 1.], [ 1., 1., 2., 2., 2., 1., 1.], [ 1., 1., 2., 2., 2., 1., 1.], [ 1., 1., 1., 1., 1., 1., 1.], [ 1., 1., 1., 1., 1., 1., 1.]]) In [103]: grey_dilation(a,structure=s) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /Users/deil/<ipython console> in <module>() /opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/ndimage/morphology.pyc in grey_dilation(input, size, footprint, structure, output, mode, cval, origin) 354 sz = footprint.shape[ii] 355 else: --> 356 sz = size[ii] 357 if not sz & 1: 358 origin[ii] -= 1 TypeError: 'NoneType' object is unsubscriptable In [104]: grey_dilation(a,footprint=s) Out[104]: array([[ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.]]) -------------------------------------------------------------- Why doesn't grey_dilation(a,structure=s) work? Why doesn't grey_dilation(a,size=(3,3),structure=s) give the same result as grey_dilation(a,size=(3,3),footprint=s)? I am running on numpy version 1.4.0 and scipy version 0.7.1 Thanks! --Christoph
On Mon, Mar 22, 2010 at 09:44, Christoph Deil <Deil.Christoph@googlemail.com> wrote:
Hi,
I am having some trouble figuring out how to use the following two aspects of scipy:
1) How can I create a random distribution given by some python formula, e.g. pdf(x) = x**2 in the range x = 1 to 2.
The example given in the rv_continuous docstring doesn't work for me: In [1]: import matplotlib.pyplot as plt In [2]: numargs = generic.numargs AttributeError: type object 'numpy.generic' has no attribute 'numargs'
Is it also possible to use rv_continuous for multidimensional distributions, say pdf(x,y)=x*y in the range x = 1 to 2 and y = 0 to 3? Or for pdfs that are just given numerically on a grid (say as the result of a gaussian_kde() ), not by a python function?
It would be nice if someone could add a few examples that show rv_continuous usage to the docstring or tutorial.
rv_continuous is a base class. You do not use it directly. The reason the docstring doesn't work is because it is used as a template to build the docstrings for the subclasses. If you want to subclass from it to make your own distribution class, read the source of the other distribution classes for examples. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
On Mon, Mar 22, 2010 at 10:50 PM, Robert Kern <robert.kern@gmail.com> wrote:
On Mon, Mar 22, 2010 at 09:44, Christoph Deil
It would be nice if someone could add a few examples that show rv_continuous usage to the docstring or tutorial.
rv_continuous is a base class. You do not use it directly. The reason the docstring doesn't work is because it is used as a template to build the docstrings for the subclasses. If you want to subclass from it to make your own distribution class, read the source of the other distribution classes for examples.
That reminds me of http://projects.scipy.org/scipy/ticket/1055. That will enable a proper docstring for rv_continuous that explains this.
Josef, if I split up the patch into a couple of logical units, will you accept it? Cheers, Ralf
Dear Robert, thanks for the tip. I tried understanding the examples in scipy/stats/distributions.py, but being a python / scipy newbie I find the mechanism hard to understand and couldn't implement the simple examples I suggested below. Maybe it would be possible to add an example to the tutorial? At http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats.rst/#stats there is an example on how to use rv_discrete, but none on how to use rv_continuous. Would it be possible to add a convenience function to scipy.stats that makes it easy to construct a distribution from a function:
p = lambda x: x**2 pdist = scipy.stats.rv_continuous_from_function(pdf=p, lim=[0,2]) # a suggestion, doesn't exist at the moment samples = pdist.rvs(size=10)
I would guess that getting random numbers from a user defined distribution function is such a common usage that it would be nice (at least for newbies like me :-) to being able to do it from the command line, without having to derive a class. Cheers, --Christoph On Mar 22, 2010, at 3:50 PM, Robert Kern wrote:
On Mon, Mar 22, 2010 at 09:44, Christoph Deil <Deil.Christoph@googlemail.com> wrote:
Hi,
I am having some trouble figuring out how to use the following two aspects of scipy:
1) How can I create a random distribution given by some python formula, e.g. pdf(x) = x**2 in the range x = 1 to 2.
The example given in the rv_continuous docstring doesn't work for me: In [1]: import matplotlib.pyplot as plt In [2]: numargs = generic.numargs AttributeError: type object 'numpy.generic' has no attribute 'numargs'
Is it also possible to use rv_continuous for multidimensional distributions, say pdf(x,y)=x*y in the range x = 1 to 2 and y = 0 to 3? Or for pdfs that are just given numerically on a grid (say as the result of a gaussian_kde() ), not by a python function?
It would be nice if someone could add a few examples that show rv_continuous usage to the docstring or tutorial.
rv_continuous is a base class. You do not use it directly. The reason the docstring doesn't work is because it is used as a template to build the docstrings for the subclasses. If you want to subclass from it to make your own distribution class, read the source of the other distribution classes for examples.
-- Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco _______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
On Mon, Mar 22, 2010 at 11:58, Christoph Deil <Deil.Christoph@googlemail.com> wrote:
Dear Robert,
thanks for the tip. I tried understanding the examples in scipy/stats/distributions.py, but being a python / scipy newbie I find the mechanism hard to understand and couldn't implement the simple examples I suggested below.
What confused you?
Maybe it would be possible to add an example to the tutorial? At http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats.rst/#stats there is an example on how to use rv_discrete, but none on how to use rv_continuous.
Would it be possible to add a convenience function to scipy.stats that makes it easy to construct a distribution from a function:
p = lambda x: x**2 pdist = scipy.stats.rv_continuous_from_function(pdf=p, lim=[0,2]) # a suggestion, doesn't exist at the moment samples = pdist.rvs(size=10)
class x2_gen(rv_continuous): def _pdf(self, x): return x * x * 0.375 x2 = x2_gen(a=0.0, b=2.0, name='x2')
I would guess that getting random numbers from a user defined distribution function is such a common usage that it would be nice (at least for newbies like me :-) to being able to do it from the command line, without having to derive a class.
It's not particularly common, no. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
On Mar 22, 2010, at 6:13 PM, Robert Kern wrote:
On Mon, Mar 22, 2010 at 11:58, Christoph Deil <Deil.Christoph@googlemail.com> wrote:
Dear Robert,
thanks for the tip. I tried understanding the examples in scipy/stats/distributions.py, but being a python / scipy newbie I find the mechanism hard to understand and couldn't implement the simple examples I suggested below.
What confused you?
I didn't know how to specify the limits. The rv_continuous docstring says: Constructor information: Definition: scipy.stats.rv_continuous(self, momtype=1, a=None, b=None, xa=-10.0, xb=10.0, xtol=1e-14, badvalue=None, name=None, longname=None, shapes=None, extradoc=None) The meaning of the parameters a, b, xa, xb doesn't seem to be documented. Now that I had a look at the source code it's obvious.
Maybe it would be possible to add an example to the tutorial? At http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats.rst/#stats there is an example on how to use rv_discrete, but none on how to use rv_continuous.
Would it be possible to add a convenience function to scipy.stats that makes it easy to construct a distribution from a function:
p = lambda x: x**2 pdist = scipy.stats.rv_continuous_from_function(pdf=p, lim=[0,2]) # a suggestion, doesn't exist at the moment samples = pdist.rvs(size=10)
class x2_gen(rv_continuous): def _pdf(self, x): return x * x * 0.375
x2 = x2_gen(a=0.0, b=2.0, name='x2')
Thanks! This works. Multidimensional correlated parameter distributions like pdf(x,y) = x*y cannot be implemented using rv_continuous, right? If yes, is there a python module to work with correlated multidimensional pdfs?
I would guess that getting random numbers from a user defined distribution function is such a common usage that it would be nice (at least for newbies like me :-) to being able to do it from the command line, without having to derive a class.
It's not particularly common, no.
-- Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco _______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
On Mon, Mar 22, 2010 at 12:47, Christoph Deil <Deil.Christoph@googlemail.com> wrote:
Multidimensional correlated parameter distributions like pdf(x,y) = x*y cannot be implemented using rv_continuous, right?
Correct; rv_continuous does not handle multidimensional distributions.
If yes, is there a python module to work with correlated multidimensional pdfs?
I don't think there are any frameworks out there along the lines of rv_continuous. There is code for sampling certain multivariate distributions in numpy.random. If you need to sample from an arbitrary multidimensional distribution, you will need to look at PyMC for doing Markov-chain Monte Carlo. http://code.google.com/p/pymc/ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Hi there, I have histograms that will form the likelihood function and need to find the best normal distributions. I would like to know if it would be possible to create my input array to rv_continuous.fit using the data of my histograms. Kind regards, Sep
On Dec 5, 2012, at 10:11 AM, Sepideh Rastin <sepideh@isc.ac.uk> wrote:
Hi there, I have histograms that will form the likelihood function and need to find the best normal distributions. I would like to know if it would be possible to create my input array to rv_continuous.fit using the data of my histograms.
If you have the raw data, you can just use mean() and std() functions. E.g., from pylab import * n = 1000 mu = 2 sigma = 0.3 def G(x,m,s): return exp(-((x-m)/s)**2/2)/sqrt(2*pi*s**2) # fake data s = randn(n)*sigma + mu # estimates mu_hat, sigma_hat = mean(s), std(s) # plot t = linspace(mu-3*sigma,mu+3*sigma,200) bins = linspace(t[0],t[-1],21) step = bins[1]-bins[0] hist(s,bins=bins) plot(t,G(t,mu_hat,sigma_hat)*n*step,'g',hold=True) show() - Paul Paul Kienzle pkienzle@nist.gov
On Wed, Dec 5, 2012 at 10:11 AM, Sepideh Rastin <sepideh@isc.ac.uk> wrote:
Hi there, I have histograms that will form the likelihood function and need to find the best normal distributions. I would like to know if it would be possible to create my input array to rv_continuous.fit using the data of my histograms.
several answers No, you cannot use the histogram for fit() directly. stats.norm.fit() expects the original data with individual observations. It's possible to create an artifical dataset by just creating observations based on the histogramm. np.repeat(bin_centers, bin_counts) or something like this. fitting a normal distribution is "boring": sample mean and standard deviation are estimates of loc and scale. There are several scripts on the web, scipy cookbook, scipy central, ... ? how to fit a normal pdf directly to a histogram. If you have only a small number of bins, then using the above will cause a discretization bias (reference ?). In this case, I would fit either the histogram or the cumulative histogram to the discrete probabilities from the discretization, for example something like minimizing def fun(cumhist, bins_edges, loc, scale): diff = cumhist - stats.norm.cdf(bin_edges, loc=loc, scale=scale) #fix or drop first element return (diff**2).sum() Josef
Kind regards, Sep _______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
Hi, The documentation of ndimage.grey_dilate is much more detailed on the doc wiki (including the use of structure, footprint etc.) http://docs.scipy.org/scipy/docs/scipy.ndimage.morphology.grey_dilation/ (like most other functions of ndimage.morphology, the docstring has been edited quite recently and the changes haven't made their way to the "official" doc yet). Cheers, Emmanuelle On Mon, Mar 22, 2010 at 03:44:41PM +0100, Christoph Deil wrote:
Hi,
I am having some trouble figuring out how to use the following two aspects of scipy:
1) How can I create a random distribution given by some python formula, e.g. pdf(x) = x**2 in the range x = 1 to 2.
The example given in the rv_continuous docstring doesn't work for me: In [1]: import matplotlib.pyplot as plt In [2]: numargs = generic.numargs AttributeError: type object 'numpy.generic' has no attribute 'numargs'
Is it also possible to use rv_continuous for multidimensional distributions, say pdf(x,y)=x*y in the range x = 1 to 2 and y = 0 to 3? Or for pdfs that are just given numerically on a grid (say as the result of a gaussian_kde() ), not by a python function?
It would be nice if someone could add a few examples that show rv_continuous usage to the docstring or tutorial.
2) How to use the 'structure' argument of scipy.ndimage.grey_dilate() correctly? There are some explanations at http://docs.scipy.org/doc/scipy/reference/tutorial/ndimage.html but I think they are inconsistent with the implementation:
-------------------------------------------------------------- In [98]: a = zeros((7,7)); a[3,3]=1; a Out[98]: array([[ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 1., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.]]) In [100]: s = generate_binary_structure(2,2); s Out[100]: array([[ True, True, True], [ True, True, True], [ True, True, True]], dtype=bool)
In [101]: grey_dilation(a,size=(3,3)) Out[101]: array([[ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.]])
In [102]: grey_dilation(a,size=(3,3),structure=s) Out[102]: array([[ 1., 1., 1., 1., 1., 1., 1.], [ 1., 1., 1., 1., 1., 1., 1.], [ 1., 1., 2., 2., 2., 1., 1.], [ 1., 1., 2., 2., 2., 1., 1.], [ 1., 1., 2., 2., 2., 1., 1.], [ 1., 1., 1., 1., 1., 1., 1.], [ 1., 1., 1., 1., 1., 1., 1.]])
In [103]: grey_dilation(a,structure=s) --------------------------------------------------------------------------- TypeError Traceback (most recent call last)
/Users/deil/<ipython console> in <module>()
/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/ndimage/morphology.pyc in grey_dilation(input, size, footprint, structure, output, mode, cval, origin) 354 sz = footprint.shape[ii] 355 else: --> 356 sz = size[ii] 357 if not sz & 1: 358 origin[ii] -= 1
TypeError: 'NoneType' object is unsubscriptable
In [104]: grey_dilation(a,footprint=s) Out[104]: array([[ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.]]) --------------------------------------------------------------
Why doesn't grey_dilation(a,structure=s) work? Why doesn't grey_dilation(a,size=(3,3),structure=s) give the same result as grey_dilation(a,size=(3,3),footprint=s)?
I am running on numpy version 1.4.0 and scipy version 0.7.1
Thanks! --Christoph
_______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
2) How to use the 'structure' argument of scipy.ndimage.grey_dilate() correctly? There are some explanations at http://docs.scipy.org/doc/scipy/reference/tutorial/ndimage.html but I think they are inconsistent with the implementation:
Oh dear, I think this is a bug that was previously reported by Tony Yu months ago (along with a patch!), which I said I would open a ticket about and look over the patch provided, but I seem to have entirely forgotten about (because I'm a flake). http://projects.scipy.org/scipy/ticket/1135 Could someone who uses grey_dilation frequently (any cellprofiler people?) check the proposed changes over and make sure it makes sense in the grey case? The tests are only for binary cases. Zach On Mar 22, 2010, at 10:44 AM, Christoph Deil wrote:
Hi,
I am having some trouble figuring out how to use the following two aspects of scipy:
1) How can I create a random distribution given by some python formula, e.g. pdf(x) = x**2 in the range x = 1 to 2.
The example given in the rv_continuous docstring doesn't work for me: In [1]: import matplotlib.pyplot as plt In [2]: numargs = generic.numargs AttributeError: type object 'numpy.generic' has no attribute 'numargs'
Is it also possible to use rv_continuous for multidimensional distributions, say pdf(x,y)=x*y in the range x = 1 to 2 and y = 0 to 3? Or for pdfs that are just given numerically on a grid (say as the result of a gaussian_kde() ), not by a python function?
It would be nice if someone could add a few examples that show rv_continuous usage to the docstring or tutorial.
2) How to use the 'structure' argument of scipy.ndimage.grey_dilate() correctly? There are some explanations at http://docs.scipy.org/doc/scipy/reference/tutorial/ndimage.html but I think they are inconsistent with the implementation:
-------------------------------------------------------------- In [98]: a = zeros((7,7)); a[3,3]=1; a Out[98]: array([[ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 1., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.]]) In [100]: s = generate_binary_structure(2,2); s Out[100]: array([[ True, True, True], [ True, True, True], [ True, True, True]], dtype=bool)
In [101]: grey_dilation(a,size=(3,3)) Out[101]: array([[ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.]])
In [102]: grey_dilation(a,size=(3,3),structure=s) Out[102]: array([[ 1., 1., 1., 1., 1., 1., 1.], [ 1., 1., 1., 1., 1., 1., 1.], [ 1., 1., 2., 2., 2., 1., 1.], [ 1., 1., 2., 2., 2., 1., 1.], [ 1., 1., 2., 2., 2., 1., 1.], [ 1., 1., 1., 1., 1., 1., 1.], [ 1., 1., 1., 1., 1., 1., 1.]])
In [103]: grey_dilation(a,structure=s) --------------------------------------------------------------------------- TypeError Traceback (most recent call last)
/Users/deil/<ipython console> in <module>()
/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/ python2.6/site-packages/scipy/ndimage/morphology.pyc in grey_dilation(input, size, footprint, structure, output, mode, cval, origin) 354 sz = footprint.shape[ii] 355 else: --> 356 sz = size[ii] 357 if not sz & 1: 358 origin[ii] -= 1
TypeError: 'NoneType' object is unsubscriptable
In [104]: grey_dilation(a,footprint=s) Out[104]: array([[ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 1., 1., 1., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0.]]) --------------------------------------------------------------
Why doesn't grey_dilation(a,structure=s) work? Why doesn't grey_dilation(a,size=(3,3),structure=s) give the same result as grey_dilation(a,size=(3,3),footprint=s)?
I am running on numpy version 1.4.0 and scipy version 0.7.1
Thanks! --Christoph
_______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
participants (8)
-
Christoph Deil -
Emmanuelle Gouillart -
josef.pktd@gmail.com -
Paul Kienzle -
Ralf Gommers -
Robert Kern -
Sepideh Rastin -
Zachary Pincus