<div dir="ltr">On Wed, Mar 7, 2018 at 1:10 PM, Marko Asplund <<a href="mailto:marko.asplund@gmail.com">marko.asplund@gmail.com</a>> wrote:<br>><br>> However, the results look very different when using random initialization.<br>> With respect to exact cost this is of course expected, but what I find troublesome<br>> is that after N training iterations the cost starts approaching zero with the NumPy<br>> code (most of the time), whereas with the Scala-based implementations cost fails<br>> to converge (most of the time).<br>><br>> With NumPy I'm simply using the following random initialization code:<br>><br>> np.random.randn(n_h, n_x) * 0.01<br>><br>> I'm trying to emulate the same behaviour in my Scala code by sampling from a<br>> Gaussian distribution with mean = 0 and std dev = 1.<br><br>`np.random.randn(n_h, n_x) * 0.01` gives a Gaussian distribution with mean=0 and stdev=0.01, not stdev=1: multiplying standard-normal samples by 0.01 scales the standard deviation by the same factor.<br><br>--<br>Robert Kern</div>