[Numpy-discussion] Numpy-discussion Digest, Vol 6, Issue 20

James A. Bednar jbednar at inf.ed.ac.uk
Fri Mar 9 15:04:56 EST 2007


|  Date: Fri, 9 Mar 2007 06:58:32 -0800
|  From: "Sebastian Haase" <haase at msg.ucsf.edu>
|  Subject: Re: [Numpy-discussion] Numpy-discussion Digest, Vol 6, Issue 18
|  To: "Discussion of Numerical Python" <numpy-discussion at scipy.org>
|  
|  On 3/9/07, James A. Bednar <jbednar at inf.ed.ac.uk> wrote:
|  > |  From: Robert Kern <robert.kern at gmail.com>
|  > |  Subject: Re: [Numpy-discussion] in place random generation
|  > |
|  > |  Daniel Mahler wrote:
|  > |  > On 3/8/07, Charles R Harris <charlesr.harris at gmail.com> wrote:
|  > |
|  > |  >> Robert thought this might relate to Travis' changes adding
|  > |  >> broadcasting to the random number generator. It does seem
|  > |  >> certain that generating small arrays of random numbers has a
|  > |  >> very high overhead.
|  > |  >
|  > |  > Does that mean someone is working on fixing this?
|  > |
|  > |  It's not on the top of my list, no.
|  >
|  > I just wanted to put in a vote saying that generating a large quantity
|  > of small arrays of random numbers is quite important in my field, and
|  > is something that is definitely slowing us down right now.
|  >
|  > We often simulate neural networks whose many, many small weight
|  > matrices need to be initialized with random numbers, and we are seeing
|  > quite slow startup times (on the order of minutes, even though
|  > reloading a pickled snapshot of the same simulation once it has been
|  > initialized takes only a few seconds).
|  >
|  > The quality of these particular random numbers doesn't matter very
|  > much for us, so we are looking for some cheaper way to fill a bunch of
|  > small matrices with at least passably random values.  But it would of
|  > course be better if the regular high-quality random number support in
|  > Numpy were speedy under these conditions...
|
|  Hey Jim,
|  
|  Could you not create all the many arrays so that they share one large
|  chunk of contiguous memory?  Something like:
|  1) create a large 1D array;
|  2) create all the small arrays in a for loop using
|     numpy.ndarray(buffer=largeArray, offset=offset, shape=..., dtype=...),
|     incrementing offset appropriately on each pass;
|  3) then you can reset all the small arrays to new random numbers with
|     one call that refills the large array (they all have the same
|     statistics: mean, stddev, dtype, right?).

In principle, I *think* we could make that work.  But we maintain a
large object-oriented toolkit for computational neuroscience (see
topographica.org), and we try to let each object take care of its own
business as much as possible, so that people can later swap things out
with their own customized versions.  That becomes hard to do once we
introduce global dependencies like this one, and the resulting code is
difficult to maintain.

Of course, we can and do put in optimizations for certain special
cases, but I suspect that in this case it will be simpler just to add
some fast-and-dirty but general way of filling small arrays with
passably random values.  Still, it would be much easier for us if
numpy's ordinary small-array random number generation had less
overhead...
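For example, something along these lines (a hypothetical helper on our
side, not an existing numpy call) would probably be good enough: draw
one large block of variates per batch and dole it out into the existing
small matrices, so the per-call overhead is paid once per batch rather
than once per matrix, while each object still owns its own array.

    import numpy

    def fill_random(matrices, low=-1.0, high=1.0):
        """Refill each small array in-place from a single large draw."""
        total = sum(m.size for m in matrices)
        block = numpy.random.uniform(low, high, size=total)  # one RNG call
        offset = 0
        for m in matrices:
            m.flat[:] = block[offset:offset + m.size]  # in-place; keeps shape/dtype
            offset += m.size

That keeps the objects decoupled; only the refill itself is batched.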

Jim
