
Hi All, I have two arrays, A and B. A is 3 x 100,000 and B has length 100,000. If I do np.dot(A,B), I get [nan, nan, nan]. However, np.any(np.isnan(A))==False and np.any(np.isnan(B))==False. Also, np.seterr(all='print') does not print anything. I am now wondering what is going on and how to avoid it. In case it is important, A and B are from the normal equation of doing regression. I am regressing 100,000 observations on 3 factors, each 100,000 long. Thanks, Tom

What are the data types of the arrays, and what are the typical sizes of the values in these arrays? I can get all nans from np.dot if the values are huge floating point values:

```
In [79]: x = 1e160*np.random.randn(3, 10)

In [80]: y = 1e160*np.random.randn(10)

In [81]: np.dot(x, y)
Out[81]: array([ nan, nan, nan])
```

Warren
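Warren's two questions (data type and typical magnitude) can be answered with a quick inspection of each array. The following is only a sketch: the helper name `describe` and the stand-in arrays are hypothetical, since the original A and B were never posted.

```python
import numpy as np

def describe(name, arr):
    """Print and return dtype, finiteness counts, and largest finite magnitude."""
    arr = np.asarray(arr)
    finite = np.isfinite(arr)
    summary = {
        "dtype": arr.dtype,
        "all_finite": bool(finite.all()),
        "n_inf": int(np.isinf(arr).sum()),
        "n_nan": int(np.isnan(arr).sum()),
        # Largest magnitude among finite entries (None if there are none).
        "max_abs_finite": float(np.abs(arr[finite]).max()) if finite.any() else None,
    }
    print(name, summary)
    return summary

# Hypothetical stand-ins shaped like the example above, not the poster's data.
rng = np.random.default_rng(0)
A = 1e160 * rng.standard_normal((3, 10))
B = 1e160 * rng.standard_normal(10)

info_A = describe("A", A)
info_B = describe("B", B)
```

A clean `np.isnan` check together with a very large `max_abs_finite`, or a nonzero `n_inf`, would point toward the overflow explanation rather than toward NaNs already present in the inputs.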

On 8/24/13, Warren Weckesser <warren.weckesser@gmail.com> wrote:
...and that happens because some intermediate terms overflow to inf or -inf, and adding these gives nan:

```
In [89]: x = np.array([1e300])

In [90]: y = np.array([1e10])

In [91]: np.dot(x,y)
Out[91]: inf

In [92]: x2 = np.array([1e300, 1e300])

In [93]: y2 = np.array([1e10,-1e10])

In [94]: np.dot(x2, y2)
Out[94]: nan
```

Warren
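To make the mechanism visible, the dot product can be split into its elementwise products and their sum. The second half of this sketch shows one possible workaround, rescaling each vector by its largest magnitude before calling `np.dot`. That rescaling is an illustration added here, not something proposed in the thread, and the names `sx`, `sy`, and `scaled` are hypothetical.

```python
import numpy as np

x2 = np.array([1e300, 1e300])
y2 = np.array([1e10, -1e10])

# Silence the warnings so the intermediate values can be inspected.
with np.errstate(over="ignore", invalid="ignore"):
    terms = x2 * y2        # each product overflows: [inf, -inf]
    total = terms.sum()    # inf + (-inf) is nan

# Possible workaround (an assumption, not from the thread): divide each
# vector by its largest magnitude so every product stays within range.
sx = np.abs(x2).max()
sy = np.abs(y2).max()
scaled = np.dot(x2 / sx, y2 / sy)   # 1*1 + 1*(-1) = 0.0

print(terms, total, scaled)
```

The true dot product is `scaled * sx * sy`. Multiplying the scale factors back in can itself overflow, so it may be preferable to carry them separately.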

Hi Warren,
Yes, you are absolutely right. I had some values close to log(x), where x is almost 0. That caused the problem.
Thanks, Tom
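Tom's actual values are not shown, so the following is only a guess at how a logarithm near zero could slip past an `np.isnan` check. For float64, the log of even the smallest positive number is only about -745, but the log of exactly 0.0 is -inf, which `np.isnan` does not flag while `np.isfinite` does. The arrays here are hypothetical.

```python
import numpy as np

x = np.array([1.0, 1e-300, 0.0])   # hypothetical inputs, one exactly zero

with np.errstate(divide="ignore"):
    logs = np.log(x)               # log(0.0) is -inf

has_nan = bool(np.isnan(logs).any())        # the check used in the question
all_finite = bool(np.isfinite(logs).all())  # also catches inf and -inf

# A zero weight multiplied by -inf produces nan inside the dot product.
w = np.array([1.0, 1.0, 0.0])
with np.errstate(all="ignore"):
    d = np.dot(logs, w)

print(logs, has_nan, all_finite, d)
```

Checking inputs with `np.isfinite` instead of `np.isnan` would catch this case before the dot product is taken.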
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

Now the question is: why does `np.dot` mask the overflow warning? In numpy 1.7.1, the default is that overflow should generate a warning:

    In [1]: np.seterr()
    Out[1]: {'divide': 'warn', 'invalid': 'warn', 'over': 'warn', 'under': 'ignore'}

But `np.dot` does not generate a warning:

    In [2]: x = np.array([1e300])
    In [3]: y = np.array([1e10])
    In [4]: np.dot(x, y)
    Out[4]: inf

Multiplying `x` and `y` generates the warning, as expected:

    In [5]: x*y
    /home/warren/anaconda/bin/ipython:1: RuntimeWarning: overflow encountered in multiply
      #!/home/warren/anaconda/bin/python
    Out[5]: array([ inf])

Warren
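Since `np.dot` stays silent in the session above, one way to avoid being surprised is to check the result explicitly. The wrapper below is a minimal sketch; `checked_dot` is a hypothetical helper, not a NumPy function, and it does not depend on whether a particular NumPy version warns.

```python
import numpy as np

def checked_dot(a, b):
    """Call np.dot, raising if finite inputs yield a non-finite result."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Suppress any floating-point warnings here; the check below is explicit.
    with np.errstate(over="ignore", invalid="ignore"):
        result = np.dot(a, b)
    inputs_finite = np.isfinite(a).all() and np.isfinite(b).all()
    if inputs_finite and not np.isfinite(result).all():
        raise FloatingPointError("np.dot overflowed on finite inputs")
    return result

ok = checked_dot(np.array([1.0, 2.0]), np.array([3.0, 4.0]))  # 1*3 + 2*4

try:
    checked_dot(np.array([1e300]), np.array([1e10]))
    raised = False
except FloatingPointError:
    raised = True

print(ok, raised)
```

The extra `np.isfinite` passes cost one scan over each input and the output, which is usually small next to the dot product itself.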
participants (3)
- Pauli Virtanen
- Tom Bennett
- Warren Weckesser