Micro NumPy & VecOpt 2.0
June 26, 2016
6:33 a.m.
Hi,

in case you have not heard: I'm currently working on the PPC and S390X ports for micronumpy. Thanks to IBM for funding this work. I'm about 50% through the PPC operations that still need to be implemented. The goal is to turn the vectorization optimization on by default in the micronumpy module.

I recently had the idea to enhance the JIT driver by giving it more information about parallel execution. I'm *not* talking about the main interpreter loop. Having a vectorized loop that executes in parallel across threads would certainly push micronumpy performance.

Has anybody already tried something similar? I think it is a challenge, but it should be possible (with a reasonable amount of work) to get a simple thread fork/join model such as the one OpenMP provides.

Cheers,
Richard
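
To make the fork/join idea a bit more concrete, here is a minimal sketch in plain Python. It only illustrates the structure (split the index range into contiguous chunks, run the same loop body on each chunk in a worker thread, then join); it is not the JIT driver interface, and the names `chunk_bounds` and `parallel_add` are made up for this example. Because of the GIL the threads do not actually run in parallel here; in the scenario described above the chunk body would be the vectorized machine-code loop.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n, workers):
    """Split range(n) into at most `workers` contiguous (start, stop) pairs."""
    step = max(1, -(-n // workers))  # ceiling division, never zero
    return [(i, min(i + step, n)) for i in range(0, n, step)]


def parallel_add(a, b, workers=4):
    """Element-wise add using a fork/join pattern (hypothetical helper)."""
    n = len(a)
    out = [0] * n

    def kernel(bounds):
        # Loop body applied to one chunk; chunks never overlap,
        # so no locking is needed for the writes into `out`.
        start, stop = bounds
        for i in range(start, stop):
            out[i] = a[i] + b[i]

    # Fork: one task per chunk. Join: leaving the `with` block waits
    # for all tasks; list() also re-raises any worker exception.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(kernel, chunk_bounds(n, workers)))
    return out
```

For example, `parallel_add([1, 2, 3], [10, 20, 30])` splits the three elements over the workers and returns the same list a sequential loop would produce.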
Richard Plangger