[Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?)

James Philbin philbinj at gmail.com
Sun Mar 23 14:22:48 EDT 2008


OK, I'm really impressed with the improvements in vectorization in
gcc 4.3. It seems able to handle real loops, which wasn't the case
with 4.1. I think Chuck's right that we should simply special-case
contiguous data and let the auto-vectorizer do the rest. Something
like this for the ufuncs:

/**begin repeat

   #TYPE=(BOOL, BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG, LONGLONG, ULONGLONG, FLOAT, DOUBLE, LONGDOUBLE)*2#
   #OP=||, +*13, ^, -*13#
   #kind=add*14, subtract*14#
   #typ=(Bool, byte, ubyte, short, ushort, int, uint, long, ulong, longlong, ulonglong, float, double, longdouble)*2#
*/

/* Contiguous case: a plain indexed loop the auto-vectorizer can handle. */
static void
@TYPE@_@kind@_contig(@typ@ *i1, @typ@ *i2, @typ@ *op, intp n)
{
   intp i;
   for (i=0; i<n; i++) {
      op[i] = i1[i] @OP@ i2[i];
   }
}

static void
@TYPE@_@kind@(char **args, intp *dimensions, intp *steps, void *func)
{
    register intp i;
    intp is1=steps[0],is2=steps[1],os=steps[2], n=dimensions[0];
    char *i1=args[0], *i2=args[1], *op=args[2];

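    /* steps are byte strides, so contiguous means a stride of sizeof(@typ@) */
    if (is1==sizeof(@typ@) && is2==sizeof(@typ@) && os==sizeof(@typ@)) {
        @TYPE@_@kind@_contig((@typ@ *)i1, (@typ@ *)i2, (@typ@ *)op, n);
        return;
    }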

    for(i=0; i<n; i++, i1+=is1, i2+=is2, op+=os) {
        *((@typ@ *)op)=*((@typ@ *)i1) @OP@ *((@typ@ *)i2);
    }
}
/**end repeat**/
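
To make the template concrete, here is roughly what one expansion
(TYPE=DOUBLE, kind=add) could look like as standalone C. This is only a
sketch: the intp typedef is a stand-in for NumPy's own type, and
check_DOUBLE_add is a hypothetical helper added purely to exercise both
paths; neither is part of the proposal itself.

```c
#include <stddef.h>

/* Stand-in for NumPy's intp (assumption: a pointer-sized signed integer). */
typedef ptrdiff_t intp;

/* Contiguous kernel: a plain indexed loop over typed pointers, which is
   the shape the auto-vectorizer is expected to recognize. */
static void
DOUBLE_add_contig(double *i1, double *i2, double *op, intp n)
{
    intp i;
    for (i = 0; i < n; i++) {
        op[i] = i1[i] + i2[i];
    }
}

/* Generic inner loop using the ufunc calling convention. The steps are
   byte strides, so "contiguous" means every stride equals sizeof(double). */
static void
DOUBLE_add(char **args, intp *dimensions, intp *steps, void *func)
{
    intp i;
    intp is1 = steps[0], is2 = steps[1], os = steps[2], n = dimensions[0];
    char *i1 = args[0], *i2 = args[1], *op = args[2];
    const intp sz = (intp)sizeof(double);

    (void)func; /* unused in this sketch */

    if (is1 == sz && is2 == sz && os == sz) {
        DOUBLE_add_contig((double *)i1, (double *)i2, (double *)op, n);
        return;
    }

    /* Strided fallback: advance the char pointers by the byte strides. */
    for (i = 0; i < n; i++, i1 += is1, i2 += is2, op += os) {
        *((double *)op) = *((double *)i1) + *((double *)i2);
    }
}

/* Hypothetical self-check: runs the contiguous and the strided path on
   the same data and returns 0 if both give the expected sums. */
static int
check_DOUBLE_add(void)
{
    double a[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    double b[8] = {10, 20, 30, 40, 50, 60, 70, 80};
    double out[8] = {0};
    char *args[3];
    intp dims[1], steps[3], i;

    /* Contiguous: all 8 elements, stride sizeof(double). */
    args[0] = (char *)a; args[1] = (char *)b; args[2] = (char *)out;
    dims[0] = 8;
    steps[0] = steps[1] = steps[2] = (intp)sizeof(double);
    DOUBLE_add(args, dims, steps, NULL);
    for (i = 0; i < 8; i++) {
        if (out[i] != a[i] + b[i]) return 1;
    }

    /* Strided: every second input element, packed output. */
    dims[0] = 4;
    steps[0] = steps[1] = 2 * (intp)sizeof(double);
    steps[2] = (intp)sizeof(double);
    DOUBLE_add(args, dims, steps, NULL);
    for (i = 0; i < 4; i++) {
        if (out[i] != a[2 * i] + b[2 * i]) return 2;
    }
    return 0;
}
```

Calling check_DOUBLE_add() from a main() and building with something
like "gcc -O2 -ftree-vectorize" should return 0 for both paths. Only the
_contig loop is a candidate for vectorization; the strided loop is kept
as the general fallback.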

We also need to add -ftree-vectorize to the standard compile flags at
least for the ufuncs.

James


