<div dir="ltr"><div>On Tue, 6 Mar 2018 12:52:14, Robert Kern wrote:<br></div><div><div>> I would just recommend using one of the codebases to initialize the</div><div>> network, save the network out to disk, and load up the initialized network</div><div>> in each of the different codebases for training. That way you are sure that</div><div>> they are both starting from the same exact network parameters.</div><div>></div><div>> Even if you do rewrite a precisely equivalent np.random.randn() for</div><div>> Scala/Java, you ought to write the code to serialize the initialized</div><div>> network anyways so that you can test that the two initialization routines</div><div>> are equivalent. But if you're going to do that, you might as well take my</div><div>> recommended approach.</div></div><div><br></div><div><div>Thanks for the suggestion! I decided to use the approach you proposed.</div><div><br></div><div>Still, I'm puzzled by an issue that seems to be related to random initialization.</div><div>I have three different NN implementations, two in Scala and one in NumPy.</div><div>When using the exact same initialization parameters, I get the same</div><div>cost after each training iteration from each implementation.
So, based on this,</div><div>I'd infer that the implementations are equivalent.</div><div>However, the results look very different when using random initialization.</div><div>With respect to the exact cost this is of course expected, but what I find troublesome</div><div>is that after N training iterations the cost starts approaching zero with the NumPy</div><div>code (most of the time), whereas with the Scala-based implementations the cost fails</div><div>to converge (most of the time).</div><div><br></div><div>With NumPy I'm simply using the following random initialization code:</div><div><br></div><div>np.random.randn(n_h, n_x) * 0.01</div><div><br></div><div>I'm trying to emulate the same behaviour in my Scala code by sampling from a</div><div>Gaussian distribution with mean = 0 and std dev = 1.</div><div><br></div><div>Any ideas?</div></div><div><br></div><div>Marko</div></div>