[Numpy-discussion] Patch for accelerate cblas_sgemv (Re: Reporting a bug to Apple)
Sturla Molden
sturla.molden at gmail.com
Tue Jun 10 17:48:08 EDT 2014
On 10/06/14 14:57, Matthew Brett wrote:
> Would you consider doing a PR for that?
Here is a patch you can try before I post a PR.
It can also be built independently of NumPy, so you don't need
to rebuild NumPy just to test it. (The change to NumPy is in a
different folder.)
I decided against using cblas_sgemm. Instead, the patch just enforces
alignment to 32-byte boundaries. Since cblas_sgemm would require a copy
anyway if the vector is strided, the choice didn't matter.
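To illustrate what enforcing 32-byte alignment can look like, here is a
rough sketch. It is not the code from the attached patch: the names
SGEMV_ALIGNMENT, is_aligned32 and copy_to_aligned are hypothetical, and
it assumes a POSIX system that provides posix_memalign. The idea is to
test the pointer and, if it is misaligned or strided, gather the data
into an aligned, contiguous temporary before handing it to BLAS.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* assumption: POSIX, for posix_memalign */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SGEMV_ALIGNMENT 32  /* the byte boundary mentioned above */

/* Return 1 if p sits on a 32-byte boundary, 0 otherwise. */
static int is_aligned32(const void *p)
{
    return ((uintptr_t)p % SGEMV_ALIGNMENT) == 0;
}

/* Hypothetical helper: gather n floats, read with element stride
 * incx (> 0), into a new contiguous, 32-byte aligned buffer.
 * Returns NULL on allocation failure; the caller frees the result. */
static float *copy_to_aligned(const float *x, size_t n, ptrdiff_t incx)
{
    void *buf = NULL;
    size_t bytes = (n ? n : 1) * sizeof(float);
    size_t i;
    float *out;

    assert(incx > 0);
    if (posix_memalign(&buf, SGEMV_ALIGNMENT, bytes) != 0)
        return NULL;
    out = buf;
    for (i = 0; i < n; i++)
        out[i] = x[(ptrdiff_t)i * incx];
    return out;
}
```

A wrapper would call is_aligned32 on the matrix and vector arguments
and fall back to copy_to_aligned only when needed, so already aligned,
contiguous input pays no extra cost.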
I have tested with Accelerate, OpenBLAS and MKL, clang and icc. From
what I can tell it works correctly and does not segfault on misalignment.
Sturla
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sgemv_patch.zip
Type: application/zip
Size: 54470 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20140610/0805f8aa/attachment.zip>