inplace dot products (Robert Kern) (Re: Numpy-discussion Digest, Vol 29, Issue 69)

Hello all, I've been toying with the idea of an extended precision accumulator for the dot product written in numpy/core/src/multiarraymodule.c Once the modification is being performed, there is no reason not to allow the specification of an output array. The functions that exist now: The routine cumsum already exists with an accumulator of a specific type. The output location is an optional argument. numpy.cumsum(a, axis=None, dtype=None, out=None) Various forms of inner product (dot product) also exist: numpy.tensordot(a, b, axes=2) Returns the tensor dot product for (ndim >= 1) arrays along an axes. The first element of the sequence determines the axis or axes in `a` to sum over, and the second element in `axes` argument sequence determines the axis or axes in `b` to sum over. numpy.vdot(a,b) Returns the dot product of a and b for scalars and vectors of floating point and complex types. The first argument, a, is conjugated. numpy.innerproduct(a,b) Returns the inner product of a and b for arrays of floating point types. Like the generic NumPy equivalent the product sum is over the last dimension of a and b. NB: The first argument is not conjugated. The proposed extensions: numpy.tensordot(a, b, axes=2, dtype=None, out=None) numpy.vdot(a,b, dtype=None, out=None) numpy.innerproduct(a,b, dtype=None, out=None) (found in numpy/core/ src/multiarraymodule.c) Is this so difficult an extension to implement? Is there a better functional specification to be made? Granted, these routines do not exist in blas. Therefore, they wouldn't be speedy, at least without blas extensions. But... the key routines to make the generic extension already exist in numpy without blas. from numeric.py: When Numpy is built with an accelerated BLAS like ATLAS, these functions are replaced to make use of the faster implementations. The faster implementations only affect float32, float64, complex64, and complex128 arrays. Furthermore, the BLAS API only includes matrix-matrix, matrix-vector, and vector-vector products. Products of arrays with larger dimensionalities use the built in functions and are not accelerated. Looking at numpy/core/src/multiarraymodule.c: The existing routine that allows an accumulator specification and output array: static PyObject * PyArray_Sum(PyArrayObject *self, int axis, int rtype, PyArrayObject *out) The routine that would be modified and its new function prototypes: static PyObject * PyArray_InnerProduct(PyObject *op1, PyObject *op2, int rtype, PyArrayObject *out) Other C code that might be modified to support the new functionality is in arraytypes.inc.src. Routines here perform a dot product along one dimension for various argument dtypes and accumulator dtypes. It looks like the accumulator type is encoded in the #out signature. As I read it, there is no change needed to this source as all the different accumulator types are present in the output signature. /**begin repeat #name=BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG, LONGLONG, ULONGLONG, FLOAT, DOUBLE, LONGDOUBLE# #type= byte, ubyte, short, ushort, int, uint, long, ulong, longlong, ulonglong, float, double, longdouble# #out= long, ulong, long, ulong, long, ulong, long, ulong, longlong, ulonglong, float, double, longdouble# */ static void @name@_dot(char *ip1, intp is1, char *ip2, intp is2, char *op, intp n, So far, the changes dont seem to be too messy. Am I missing anything here? David On Feb 20, 2009, at 10:00 AM, numpy-discussion-request@scipy.org wrote:
Message: 2 Date: Fri, 20 Feb 2009 09:52:03 -0600 From: Robert Kern <robert.kern@gmail.com> Subject: Re: [Numpy-discussion] inplace dot products To: Discussion of Numerical Python <numpy-discussion@scipy.org> Message-ID: <3d375d730902200752k6b71efc9gb1e0ac1d260d65d@mail.gmail.com> Content-Type: text/plain; charset=UTF-8
On Fri, Feb 20, 2009 at 05:18, David Warde-Farley <dwf@cs.toronto.edu> wrote:
Hi Olivier,
There was this idea posted on the Scipy-user list a while back:
http://projects.scipy.org/pipermail/scipy-user/2008-August/ 017954.html
but it doesn't look like he got anywhere with it, or even got a response.
I just tried it and I observe the same behaviour. A quick look at the SciPy sources tells me there is something fishy.
subroutine < tchar=s,d,c,z>gemm (m,n,k,alpha,a,b,beta,c,trans_a,trans_b,lda,ka,ldb,kb) ! c = gemm(alpha,a,b,beta=0,c=0,trans_a=0,trans_b=0,overwrite_c=0) ! Calculate C <- alpha * op(A) * op(B) + beta * C
I don't read Fortran very well, but it seems to me as though the Fortran prototype doesn't match the python prototype.
What do you mean? Based on the rest of the argument information in that block, f2py creates the Python prototype. For example, all of the m,n,k,lda,ka,ldb,kb dimensions are found from the input arrays themselves, optional arguments are given defaults, etc.
I'll poke around a little more, but in summary: there's no numpy- sanctioned way to specify an output array for a dot(), AFAIK. This is a bit of an annoyance, I agree, though I seem to remember Robert Kern offering a fairly compelling argument why it's hard. I just don't know
I believe I was only talking about why it would be hard to use a higher-precision accumulator.
-- Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
------------------------------
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
End of Numpy-discussion Digest, Vol 29, Issue 69 ************************************************
participants (1)
-
David Henderson