[Numpy-discussion] numpy.power -> numpy.random.choice Probabilities don't sum to 1

Fri Dec 18 20:00:05 EST 2015

On Fri, Dec 18, 2015 at 1:25 PM, Ryan R. Rosario <ryan at bytemining.com> wrote:
> Hi,
>
> I have a matrix whose entries I must raise to a certain power and then normalize by row. After I do that, when I pass some rows to numpy.random.choice, I get a ValueError: probabilities do not sum to 1.
>
> I understand that floating point is not perfect, and my matrix is so large that I cannot use np.longdouble because I will run out of RAM.
>
> As an example on a smaller matrix:
>
> np.power(mymatrix, 10, out=mymatrix)
> row_normalized = np.apply_along_axis(lambda x: x / np.sum(x), 1, mymatrix)

I'm sorry I don't have a solution to your actual problem off the top
of my head, but it's probably helpful in general to know that a better
way to write this would be just

  row_normalized = mymatrix / np.sum(mymatrix, axis=1, keepdims=True)

apply_along_axis is slow because it calls your function once per row
in a Python-level loop, and it can almost always be replaced by a
broadcasting expression like this.
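
To check that the two forms agree, here is a minimal sketch. It
assumes a small random positive matrix standing in for mymatrix; the
shape and seed are arbitrary and only for illustration.

```python
import numpy as np

# Hypothetical stand-in for the real matrix: 4 rows, 5 columns,
# strictly positive so that no row sums to zero.
rng = np.random.default_rng(0)
mymatrix = rng.random((4, 5)) + 0.1

# Original form: one Python-level function call per row.
slow = np.apply_along_axis(lambda x: x / np.sum(x), 1, mymatrix)

# Broadcasting form: keepdims=True keeps the row sums as a (4, 1)
# column, which broadcasts across the 5 columns of each row.
row_sums = np.sum(mymatrix, axis=1, keepdims=True)
fast = mymatrix / row_sums

same = np.allclose(slow, fast)
row_sum_shape = row_sums.shape
```

Without keepdims=True the row sums would have shape (4,), which does
not line up with the rows of a (4, 5) array under broadcasting.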

> sums = row_normalized.sum(axis=1)
> sums[np.where(sums != 1)]

And here you can just write

  sums[sums != 1]

i.e. the call to where() isn't doing anything useful: indexing with
the boolean mask directly selects the same elements.
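
A small sketch of that equivalence, assuming a made-up sums array
whose values are chosen only to sit one floating-point step away
from 1:

```python
import numpy as np

# Hypothetical row sums: two exactly 1.0, two off by one ulp.
just_below = np.nextafter(1.0, 0.0)  # largest float below 1.0
just_above = np.nextafter(1.0, 2.0)  # smallest float above 1.0
sums = np.array([1.0, just_below, 1.0, just_above])

# Indexing through where() and indexing with the boolean mask
# directly pick out the same elements in the same order.
via_where = sums[np.where(sums != 1)]
via_mask = sums[sums != 1]

identical = np.array_equal(via_where, via_mask)
n_off = via_mask.size
```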

-n

-- 
Nathaniel J. Smith -- http://vorpus.org