Mailman 3 Conditional random variables - NumPy-Discussion - python.org

Conditional random variables

Ted To

July 5, 2011

2:13 a.m.

Hi, is there an easy way to make random draws from a conditional random variable? E.g., draw a random variable x conditional on x >= \bar{x}. Thank you, Ted To

josef.pktd＠gmail.com

July 2011

2:17 p.m.

On Mon, Jul 4, 2011 at 10:13 PM, Ted To <rainexpected@theo.to> wrote:

If you mean a truncated distribution here, then I asked a similar question about the normal distribution on the scipy-user list a month ago. The answer was to use rejection sampling, Gibbs sampling, or MCMC. I just sample from the original distribution and throw away the values that are not in the desired range. This works fine if only a small part of the distribution is truncated, but not so well when the support lies only in the tails. It's reasonably fast for distributions that numpy.random can sample quickly. (Having a bi- or multivariate distribution and sampling y conditional on a given x sounds more "fun".) Josef
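The "sample and throw away" approach described above could be sketched roughly as follows. This is only an illustration: the helper name `sample_truncated_by_rejection`, the `batch` size, and the standard-normal example are assumptions, not code from the thread.

```python
import numpy as np

def sample_truncated_by_rejection(draw, lower, size, rng, batch=10_000):
    """Return `size` draws from `draw`, keeping only values >= lower.

    `draw(rng, n)` is a hypothetical callable returning n unconditional draws.
    """
    out = np.empty(0)
    while out.size < size:
        x = draw(rng, batch)
        # Keep the accepted values; everything below `lower` is discarded.
        out = np.concatenate([out, x[x >= lower]])
    return out[:size]

rng = np.random.default_rng(0)
samples = sample_truncated_by_rejection(
    lambda rng, n: rng.normal(0.0, 1.0, n), lower=1.0, size=1000, rng=rng
)
```

The acceptance rate equals P(x >= lower), so the loop slows down sharply as `lower` moves into the tail, which is the weakness noted above.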

Ted To

2:33 p.m.

On 07/05/2011 10:17 AM, josef.pktd@gmail.com wrote:

Yes, that is what I had been doing, but in some cases my truncation moves into the upper tail and it takes an extraordinary amount of time. I found that I could use scipy.stats.truncnorm, but I haven't yet figured out how to use it for a joint distribution. E.g., I have two normal rv's X and Y, and I would like to draw X and Y conditional on X+Y >= u. Any suggestions? Cheers, Ted To
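For the univariate case, a minimal sketch of drawing from the upper tail with scipy.stats.truncnorm might look like this. The parameter values are made up for illustration. Note that truncnorm takes its bounds `a` and `b` in standardized units, i.e. relative to `loc` and `scale`.

```python
import numpy as np
from scipy.stats import truncnorm

mu, sigma, lower = 0.0, 1.0, 4.0   # illustrative values, not from the thread

# truncnorm expects the truncation points on the standard-normal scale.
a = (lower - mu) / sigma
b = np.inf

# An integer seed is used here only to make the sketch reproducible.
tail_draws = truncnorm.rvs(a, b, loc=mu, scale=sigma, size=1000, random_state=0)
```

Unlike rejection sampling, the cost here does not grow as the truncation point moves further into the tail.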

josef.pktd＠gmail.com

3:07 p.m.

On Tue, Jul 5, 2011 at 10:33 AM, Ted To <rainexpected@theo.to> wrote:

If you only need to sample the sum Z=X+Y, then it is just a univariate normal again (in Z). For the general case, I'm at least a month away from being able to sample from a generic multivariate distribution. There is an integral transform that does recursive conditioning y|x (like the F^{-1} transform for multivariate distributions, used for example for copulas). For example, sample x>=u and then sample y>=u-x. That's two univariate normal samples. Another trick I used for the tail is to take the absolute value around the mean; because of symmetry you get twice as many valid samples. I have also never tried importance sampling or the other biased sampling procedures. If you find something, I'm also interested in a solution. Cheers, Josef
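The absolute-value trick mentioned above might be sketched as below for a normal distribution. It assumes the truncation point is at or above the mean, since only then does reflecting around the mean leave the conditional distribution unchanged. The function name and arguments are hypothetical.

```python
import numpy as np

def sample_upper_tail_reflected(mu, sigma, lower, size, rng, batch=10_000):
    """Rejection-sample x | x >= lower for x ~ N(mu, sigma^2), lower >= mu."""
    if lower < mu:
        raise ValueError("reflection is only valid for lower >= mu")
    out = np.empty(0)
    while out.size < size:
        x = rng.normal(mu, sigma, batch)
        # Fold the lower half onto the upper half; by symmetry this doubles
        # the share of draws that can land above `lower`.
        x = mu + np.abs(x - mu)
        out = np.concatenate([out, x[x >= lower]])
    return out[:size]

rng = np.random.default_rng(0)
reflected = sample_upper_tail_reflected(0.0, 1.0, lower=2.0, size=1000, rng=rng)
```

This only doubles the acceptance rate, so it does not help much for truncation points very far out in the tail.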

Ted To

4:26 p.m.

On 07/05/2011 11:07 AM, josef.pktd@gmail.com wrote:

For example, sample x>=u and then sample y>=u-x. That's two univariate normal samples.

Ah, that's what I was looking for! Many thanks!

josef.pktd＠gmail.com

5:36 p.m.

On Tue, Jul 5, 2011 at 12:26 PM, Ted To <rainexpected@theo.to> wrote:

Just in case I wasn't clear: if x and y are correlated, then y with y>u-x needs to be sampled from the conditional distribution y|x. http://en.wikipedia.org/wiki/Multivariate_normal_distribution#Conditional_di... Josef
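For a bivariate normal, the conditional distribution y|x is again normal, with mean mu_y + rho*(sigma_y/sigma_x)*(x - mu_x) and variance sigma_y^2*(1 - rho^2). A small sketch with illustrative parameter values follows. It only shows the untruncated conditional draw and is not a complete solution to the truncated joint problem.

```python
import numpy as np

def conditional_y_given_x(x, mu_x, sigma_x, mu_y, sigma_y, rho):
    """Mean and standard deviation of y | x for a bivariate normal."""
    mean = mu_y + rho * sigma_y / sigma_x * (x - mu_x)
    sd = sigma_y * np.sqrt(1.0 - rho**2)
    return mean, sd

rng = np.random.default_rng(0)
mu_x, sigma_x, mu_y, sigma_y, rho = 0.0, 1.0, 1.0, 2.0, 0.6  # made-up values

x = rng.normal(mu_x, sigma_x, 50_000)
cond_mean, cond_sd = conditional_y_given_x(x, mu_x, sigma_x, mu_y, sigma_y, rho)
# Drawing y around its conditional mean reproduces the joint distribution.
y = rng.normal(cond_mean, cond_sd)
```

As a sanity check, the empirical correlation of `x` and `y` should come out close to `rho`.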

Janos

October 2011

9:11 p.m.

Ted To <rainexpected <at> theo.to> writes:

You need to be careful, though: if you just sample x|x>=u and then sample y|y>=u-x, you'll get the wrong distribution unless x|x>=u has the same distribution as x|x+y>=u, which is false in general. What you should actually do if you want draws from (x,y)|x+y>=u is first sample (x+y)|(x+y)>=u, then sample x|x+y, and then compute y=(x+y)-x. If x~N(mu_x, sigma_x^2) and y~N(mu_y, sigma_y^2) with correlation rho, then x+y~N(mu_x+mu_y, sigma_s^2) with sigma_s^2=sigma_x^2+sigma_y^2+2*rho*sigma_x*sigma_y, and x|x+y~N(mu_x+r*(sigma_x/sigma_s)*(x+y-mu_x-mu_y), sigma_x^2*(1-r^2)), where r=cor(x,x+y)=(sigma_x+rho*sigma_y)/sigma_s, which has magnitude (1+(1-rho^2)(rho+sigma_x/sigma_y)^-2)^(-1/2).
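The three-step procedure above (truncated sum, then x given the sum, then y by subtraction) could be sketched as follows. The function name is hypothetical. The slope of the conditional mean is written as cov(x, x+y)/var(x+y), the usual bivariate-normal regression coefficient, and passing a NumPy Generator as `random_state` assumes a reasonably recent SciPy.

```python
import numpy as np
from scipy.stats import truncnorm

def sample_xy_given_sum_at_least(u, mu_x, sigma_x, mu_y, sigma_y, rho, size, rng):
    """Draw (x, y) from a bivariate normal conditional on x + y >= u."""
    mu_s = mu_x + mu_y
    var_s = sigma_x**2 + sigma_y**2 + 2.0 * rho * sigma_x * sigma_y
    sigma_s = np.sqrt(var_s)

    # Step 1: s = x + y from a normal truncated below at u.
    a = (u - mu_s) / sigma_s
    s = truncnorm.rvs(a, np.inf, loc=mu_s, scale=sigma_s, size=size,
                      random_state=rng)

    # Step 2: x | s is normal; the truncation only acts through s.
    cov_xs = sigma_x**2 + rho * sigma_x * sigma_y
    slope = cov_xs / var_s
    cond_var = max(sigma_x**2 - cov_xs**2 / var_s, 0.0)
    x = rng.normal(mu_x + slope * (s - mu_s), np.sqrt(cond_var))

    # Step 3: y follows by subtraction.
    return x, s - x

rng = np.random.default_rng(0)
x, y = sample_xy_given_sum_at_least(3.0, 0.0, 1.0, 0.0, 1.0, 0.5, 20_000, rng)
```

In this symmetric example x and y should have the same mean, which the naive "x >= u, then y >= u - x" scheme would not give.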
