I have some neural network code in NumPy that I'd like to compare with a Scala-based implementation. My problem is currently random initialization of the neural net parameters. I'd like to be able to get the same results from both implementations when using the same random seed.

One approach I've thought of would be to use the NumPy random generator also with the Scala implementation, but unfortunately the linear algebra library I'm using doesn't provide an equivalent for this.

Could someone give pointers to implementing numpy.random.randn? Or alternatively, is there an equivalent random generator for Scala or Java?

Marko
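For context: within NumPy itself, reproducibility only takes a seed; the hard part of the question is reproducing NumPy's stream (MT19937 plus its Gaussian transform) from Scala or Java, whose standard generators use different algorithms (java.util.Random, which Scala typically wraps, is a 48-bit LCG). A quick illustration of the NumPy side:

    import numpy as np

    np.random.seed(1234)
    a = np.random.randn(2, 2)

    np.random.seed(1234)
    b = np.random.randn(2, 2)

    # The same seed reproduces the exact same draws within NumPy...
    assert (a == b).all()
    # ...but seeding java.util.Random with 1234 yields an unrelated stream,
    # so matching NumPy from the JVM would mean reimplementing its generator.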
On Tue, Mar 6, 2018 at 1:39 AM, Marko Asplund <marko.asplund@gmail.com> wrote:

> I have some neural network code in NumPy that I'd like to compare with a
> Scala-based implementation. My problem is currently random initialization
> of the neural net parameters. I'd like to be able to get the same results
> from both implementations when using the same random seed. One approach
> I've thought of would be to use the NumPy random generator also with the
> Scala implementation, but unfortunately the linear algebra library I'm
> using doesn't provide an equivalent for this. Could someone give pointers
> to implementing numpy.random.randn? Or alternatively, is there an
> equivalent random generator for Scala or Java?

I would just recommend using one of the codebases to initialize the network, save the network out to disk, and load up the initialized network in each of the different codebases for training. That way you are sure that they are both starting from the same exact network parameters.

Even if you do rewrite a precisely equivalent np.random.randn() for Scala/Java, you ought to write the code to serialize the initialized network anyways so that you can test that the two initialization routines are equivalent. But if you're going to do that, you might as well take my recommended approach.

--
Robert Kern
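A minimal sketch of this save-and-load approach, assuming hypothetical layer sizes n_x and n_h and a plain CSV interchange format so the Scala side can read the same files (e.g. with Breeze's csvread):

    import numpy as np

    # Hypothetical layer sizes for illustration.
    n_x, n_h = 4, 3

    np.random.seed(42)                      # fix the seed for reproducibility
    W1 = np.random.randn(n_h, n_x) * 0.01   # small Gaussian initialization
    b1 = np.zeros((n_h, 1))

    # Write the initialized parameters in a portable text format.
    np.savetxt("W1.csv", W1, delimiter=",")
    np.savetxt("b1.csv", b1, delimiter=",")

    # Each implementation then loads the identical starting parameters.
    W1_loaded = np.loadtxt("W1.csv", delimiter=",", ndmin=2)
    b1_loaded = np.loadtxt("b1.csv", delimiter=",", ndmin=2)
    assert np.allclose(W1, W1_loaded)

Since both runs then start from identical parameters, any divergence in the cost curves points at the training code rather than the initialization.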
On Tue, 6 Mar 2018 12:52:14, Robert Kern wrote:

> I would just recommend using one of the codebases to initialize the
> network, save the network out to disk, and load up the initialized network
> in each of the different codebases for training. That way you are sure
> that they are both starting from the same exact network parameters.
>
> Even if you do rewrite a precisely equivalent np.random.randn() for
> Scala/Java, you ought to write the code to serialize the initialized
> network anyways so that you can test that the two initialization routines
> are equivalent. But if you're going to do that, you might as well take my
> recommended approach.
Thanks for the suggestion! I decided to use the approach you proposed.

Still, I'm puzzled by an issue that seems to be related to random initialization. I have three different NN implementations, two in Scala and one in NumPy. When using the exact same initialization parameters I get the same cost after each training iteration from each implementation, so based on this I'd infer that the implementations work equivalently.

However, the results look very different when using random initialization. With respect to the exact cost this is of course expected, but what I find troublesome is that after N training iterations the cost starts approaching zero with the NumPy code (most of the time), whereas with the Scala-based implementations the cost fails to converge (most of the time).

With NumPy I'm simply using the following random initialization code:

np.random.randn(n_h, n_x) * 0.01

I'm trying to emulate the same behaviour in my Scala code by sampling from a Gaussian distribution with mean = 0 and std dev = 1.

Any ideas?

Marko
On Wed, Mar 7, 2018 at 1:10 PM, Marko Asplund <marko.asplund@gmail.com> wrote:

> However, the results look very different when using random initialization.
> With respect to the exact cost this is of course expected, but what I find
> troublesome is that after N training iterations the cost starts
> approaching zero with the NumPy code (most of the time), whereas with the
> Scala-based implementations the cost fails to converge (most of the time).
>
> With NumPy I'm simply using the following random initialization code:
>
> np.random.randn(n_h, n_x) * 0.01
>
> I'm trying to emulate the same behaviour in my Scala code by sampling from
> a Gaussian distribution with mean = 0 and std dev = 1.
`np.random.randn(n_h, n_x) * 0.01` gives a Gaussian distribution of mean=0 and stdev=0.01.

--
Robert Kern
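In other words, the Scala code would need to scale its samples by 0.01 as well. A quick empirical check of the two distributions (an illustration, not from the original reply):

    import numpy as np

    np.random.seed(0)
    scaled = np.random.randn(100000) * 0.01  # what the NumPy code actually draws
    unit = np.random.randn(100000)           # what the Scala code was sampling

    print(scaled.std())  # ~0.01
    print(unit.std())    # ~1.0

Weights drawn with standard deviation 1 are a hundred times larger than intended, which can saturate the activations and would explain why the Scala runs fail to converge.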