[SciPy-User] Using GEMM with specified output-array

Thomas Unterthiner thomas_unterthiner at web.de
Wed Mar 12 06:21:34 EDT 2014


On 2014-03-11 19:23, Sturla Molden wrote:
> Thomas Unterthiner <thomas_unterthiner at web.de> wrote:
>> Hi there!
>>
>> I would like to use *GEMM without allocating an output-array. So I used
>> the functions from scipy.linalg.blas, however it seems as if the
>> 'overwrite_c' argument gets completely ignored:
>>
>>       from scipy.linalg.blas import sgemm
>>       X = np.random.randn(30, 50).astype(np.float32)
>>       Y = np.random.randn(50, 30).astype(np.float32)
>>       Z = np.zeros((X.shape[0], Y.shape[1]), dtype=np.float32)
>>       res = f(1.0, X, Y, c=Z, trans_a=0, trans_b=0, overwrite_c=1)
>>       assert res == np.dot(X, Y).all()
>>       print res is Z
>>
>> prints "False", meaning a new array is allocated for the result.
>> However, I want my result to be written into 'Z'. Before anyone asks: I
>> can't use np.dot's 'out' argument since I also want to specify 'alpha'.
>>
>> Is there anything I'm missing here?  (I'm using scipy 0.13.3 / numpy
>> 1.8.0 with OpenBLAS on Ubuntu 13.10).
>
> You are passing C contiguous arrays to Fortran. f2py will copy and
> transpose them. If you want to avoid this, always pass Fortran order arrays
> to scipy.linalg.* functions. You create a Fortran order view of a C array
> by taking .T (transpose is O(1) in NumPy) or create the array with keyword
> argument order='F'.  Check that .flags['F_CONTIGUOUS'] is True for all the
> arrays you pass to scipy.linalg.
>
> The "overwrite_c" option in GEMM is just a suggestion. It is not enforced.
> If f2py has to copy and transpose your C input array it does nothing.
>
> Sturla
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

Thanks, that was helpful! :)  I tried:

     from scipy.linalg.blas import sgemm
     X = np.random.randn(30, 50).astype(np.float32).T
     Y = np.random.randn(50, 30).astype(np.float32).T
     Z = np.zeros((X.shape[0], Y.shape[1]), dtype=np.float32).T
     res = sgemm(1.0, Y, X, c=Z, trans_a=0, trans_b=0, overwrite_c=1)
     assert (res.T == np.dot(X.T, Y.T)).all(), "not equal"
     assert (X.flags['F_CONTIGUOUS'], Y.flags['F_CONTIGUOUS'], 
Z.flags['F_CONTIGUOUS']) == (True, True, True)
     print res is Z

However, this gives me "failed in converting 2nd keyword `c' of 
_fblas.sgemm to C/Fortran array". I assume this is because 
Z.flags['OWNDATA'] is False, due to the transpose. Is that correct?

The problem is that other parts of my program expect all my matrices in 
row-major order, so  sing `order="F"`  is not really an option for me. 
Is there any way around this?

Cheers

Thomas








More information about the SciPy-User mailing list