Mailman 3 inplace dot products - NumPy-Discussion

newer
Superfluous array transpose (cf....

inplace dot products

older
Has IPython been useful to you?...

Olivier Grisel

Feb. 18, 2009

9:06 a.m.

Hi numpist people, I discovered the ufunc and there ability to compute the results on preallocated arrays:

...

...
...
a = arange(10, dtype=float32) b = arange(10, dtype=float32) + 1 c = add(a, b, a) c is a True a array([ 1., 3., 5., 7., 9., 11., 13., 15., 17., 19.], dtype=float32)

My questions is : is there a way to have an equivalent for the dot product operation: I want atlas to build my dot products without allocating a temporary array and reuse a preallocated array of results. Suppose I have:

...

...
...
results = array((10, 3), dtype=float32) W = arange(6, dtype=float32).reshape((2, 3)) x = arange(20, dtype=float32).reshape((10, 2))

What I want is the equivalent of the following without the intermediate call to malloc:

...

...
...
results[:] = dot(x, W)

Any idea? I tried to introspect the various docstring of numpy core modules but I could not get any lead. -- Olivier

Show replies by date

David Warde-Farley

February 2009

11:18 a.m.

Hi Olivier, There was this idea posted on the Scipy-user list a while back: http://projects.scipy.org/pipermail/scipy-user/2008-August/017954.html but it doesn't look like he got anywhere with it, or even got a response. I just tried it and I observe the same behaviour. A quick look at the SciPy sources tells me there is something fishy. subroutine < tchar=s,d,c,z>gemm(m,n,k,alpha,a,b,beta,c,trans_a,trans_b,lda,ka,ldb,kb) ! c = gemm(alpha,a,b,beta=0,c=0,trans_a=0,trans_b=0,overwrite_c=0) ! Calculate C <- alpha * op(A) * op(B) + beta * C I don't read Fortran very well, but it seems to me as though the Fortran prototype doesn't match the python prototype. I'll poke around a little more, but in summary: there's no numpy- sanctioned way to specify an output array for a dot(), AFAIK. This is a bit of an annoyance, I agree, though I seem to remember Robert Kern offering a fairly compelling argument why it's hard. I just don't know what that argument is :) David On 18-Feb-09, at 4:06 AM, Olivier Grisel wrote:

...

Hi numpist people,

I discovered the ufunc and there ability to compute the results on preallocated arrays:

...
...
...
a = arange(10, dtype=float32) b = arange(10, dtype=float32) + 1 c = add(a, b, a) c is a True a array([ 1., 3., 5., 7., 9., 11., 13., 15., 17., 19.], dtype=float32)

My questions is : is there a way to have an equivalent for the dot product operation: I want atlas to build my dot products without allocating a temporary array and reuse a preallocated array of results. Suppose I have:

...
...
...
results = array((10, 3), dtype=float32) W = arange(6, dtype=float32).reshape((2, 3)) x = arange(20, dtype=float32).reshape((10, 2))

What I want is the equivalent of the following without the intermediate call to malloc:

...
...
...
results[:] = dot(x, W)

Any idea? I tried to introspect the various docstring of numpy core modules but I could not get any lead.

-- Olivier _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion

Olivier Grisel

11:41 a.m.

2009/2/20 David Warde-Farley <dwf@cs.toronto.edu>:

...

Hi Olivier,

There was this idea posted on the Scipy-user list a while back:

http://projects.scipy.org/pipermail/scipy-user/2008-August/017954.html

but it doesn't look like he got anywhere with it, or even got a response.

I just tried it and I observe the same behaviour. A quick look at the SciPy sources tells me there is something fishy.

subroutine < tchar=s,d,c,z>gemm(m,n,k,alpha,a,b,beta,c,trans_a,trans_b,lda,ka,ldb,kb) ! c = gemm(alpha,a,b,beta=0,c=0,trans_a=0,trans_b=0,overwrite_c=0) ! Calculate C <- alpha * op(A) * op(B) + beta * C

I don't read Fortran very well, but it seems to me as though the Fortran prototype doesn't match the python prototype.

I'll poke around a little more, but in summary: there's no numpy- sanctioned way to specify an output array for a dot(), AFAIK. This is a bit of an annoyance, I agree, though I seem to remember Robert Kern offering a fairly compelling argument why it's hard. I just don't know what that argument is :)

Alright, thanks for the reply. Is there a canonical way /sample code to gain low level access to blas / lapack atlas routines using ctypes from numpy / scipy code? I don't mind fixing the dimensions and the ndtype of my array if it can decrease the memory overhead. BTW, Robert if your insight on this topic would be very much appreciated. -- Olivier

David Cournapeau

11:39 a.m.

Olivier Grisel wrote:

...

2009/2/20 David Warde-Farley <dwf@cs.toronto.edu>:

...
Hi Olivier,

There was this idea posted on the Scipy-user list a while back:

http://projects.scipy.org/pipermail/scipy-user/2008-August/017954.html

but it doesn't look like he got anywhere with it, or even got a response.

I just tried it and I observe the same behaviour. A quick look at the SciPy sources tells me there is something fishy.

subroutine < tchar=s,d,c,z>gemm(m,n,k,alpha,a,b,beta,c,trans_a,trans_b,lda,ka,ldb,kb) ! c = gemm(alpha,a,b,beta=0,c=0,trans_a=0,trans_b=0,overwrite_c=0) ! Calculate C <- alpha * op(A) * op(B) + beta * C

I don't read Fortran very well, but it seems to me as though the Fortran prototype doesn't match the python prototype.

I'll poke around a little more, but in summary: there's no numpy- sanctioned way to specify an output array for a dot(), AFAIK. This is a bit of an annoyance, I agree, though I seem to remember Robert Kern offering a fairly compelling argument why it's hard. I just don't know what that argument is :)

Alright, thanks for the reply.

Is there a canonical way /sample code to gain low level access to blas / lapack atlas routines using ctypes from numpy / scipy code?

You can just use ctypes to access ATLAS, as you would do for any library. Or do you mean something else ? cheers, David

David Warde-Farley

12:25 p.m.

On 20-Feb-09, at 6:39 AM, David Cournapeau wrote:

...

You can just use ctypes to access ATLAS, as you would do for any library. Or do you mean something else ?

Say, David... :) Do you have any idea why the pyf wrapper for fblas3 completely ignores the overwrite_c argument? Fiddling around I've found other blas/lapack functions where the same arg is offered but the choice actually does something. Regards, (Other) David

Robert Kern

3:39 p.m.

On Fri, Feb 20, 2009 at 06:25, David Warde-Farley <dwf@cs.toronto.edu> wrote:

...

On 20-Feb-09, at 6:39 AM, David Cournapeau wrote:

...
You can just use ctypes to access ATLAS, as you would do for any library. Or do you mean something else ?

Say, David... :)

Do you have any idea why the pyf wrapper for fblas3 completely ignores the overwrite_c argument?

I believe the "copy" is the culprit. double precision dimension(m,n),intent(in,out,copy),depend(m,n),optional :: c

...

Fiddling around I've found other blas/lapack functions where the same arg is offered but the choice actually does something.

Examples? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

David Warde-Farley

9:06 p.m.

On 20-Feb-09, at 10:39 AM, Robert Kern wrote:

...

...
Fiddling around I've found other blas/lapack functions where the same arg is offered but the choice actually does something.

Examples?

scipy.lib.lapack.flapack.dpotri, for example. I'm not sure of the proper usage, but when I pass it an identity matrix, depending on whether overwrite_c is True or not, the memory pointed to by the variable gets overwritten. David

David Warde-Farley

12:41 a.m.

On 20-Feb-09, at 10:39 AM, Robert Kern wrote:

...

...
Fiddling around I've found other blas/lapack functions where the same arg is offered but the choice actually does something.

Examples?

An even better example is scipy.linalg.fblas.dgemv, the matrix-vector equivalent of dgemm. overwrite_y behaves correctly there. David

David Warde-Farley

March 2009

8:11 a.m.

On 20-Feb-09, at 6:41 AM, Olivier Grisel wrote:

...

Alright, thanks for the reply.

Is there a canonical way /sample code to gain low level access to blas / lapack atlas routines using ctypes from numpy / scipy code?

I don't mind fixing the dimensions and the ndtype of my array if it can decrease the memory overhead.

I got some clarification from Pearu Peterson off-list. For gemm the issue is that if the matrix C is not Fortran-ordered, it will be copied, and that copy will be over-written. order='F' when creating the array being overwritten will fix this. DWF

Robert Kern

February 2009

3:52 p.m.

On Fri, Feb 20, 2009 at 05:18, David Warde-Farley <dwf@cs.toronto.edu> wrote:

...

Hi Olivier,

There was this idea posted on the Scipy-user list a while back:

http://projects.scipy.org/pipermail/scipy-user/2008-August/017954.html

but it doesn't look like he got anywhere with it, or even got a response.

I just tried it and I observe the same behaviour. A quick look at the SciPy sources tells me there is something fishy.

subroutine < tchar=s,d,c,z>gemm(m,n,k,alpha,a,b,beta,c,trans_a,trans_b,lda,ka,ldb,kb) ! c = gemm(alpha,a,b,beta=0,c=0,trans_a=0,trans_b=0,overwrite_c=0) ! Calculate C <- alpha * op(A) * op(B) + beta * C

I don't read Fortran very well, but it seems to me as though the Fortran prototype doesn't match the python prototype.

What do you mean? Based on the rest of the argument information in that block, f2py creates the Python prototype. For example, all of the m,n,k,lda,ka,ldb,kb dimensions are found from the input arrays themselves, optional arguments are given defaults, etc.

...

I'll poke around a little more, but in summary: there's no numpy- sanctioned way to specify an output array for a dot(), AFAIK. This is a bit of an annoyance, I agree, though I seem to remember Robert Kern offering a fairly compelling argument why it's hard. I just don't know

I believe I was only talking about why it would be hard to use a higher-precision accumulator. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

5822

Age (days ago)

5848

Last active (days ago)

List overview

Download

9 comments

4 participants

participants (4)

David Cournapeau
David Warde-Farley
Olivier Grisel
Robert Kern

inplace dot products

Olivier Grisel

David Warde-Farley

Olivier Grisel

David Cournapeau

David Warde-Farley

Robert Kern

David Warde-Farley

David Warde-Farley

David Warde-Farley

Robert Kern

tags

participants (4)