[Numpy-discussion] Patch for accelerate cblas_sgemv (Re: Reporting a bug to Apple)

Sturla Molden sturla.molden at gmail.com
Tue Jun 10 17:48:08 EDT 2014


On 10/06/14 14:57, Matthew Brett wrote:

 > Would you consider doing a PR for that?

Here is a patch you can try before I post a PR.

It can also be built independently of NumPy, so you don't need
to rebuild NumPy just to test it. (The change to NumPy is in a 
different folder.)

I decided against using cblas_sgemm. Instead the patch just enforces 
alignment to 32-byte boundaries. Because cblas_sgemm would also require 
a copy if the vector is strided, the choice didn't matter.
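
To illustrate what "enforcing alignment to 32-byte boundaries" can look 
like, here is a minimal sketch. It is not taken from the attached patch; 
the names SGEMV_ALIGN, is_aligned32 and copy_aligned32 are hypothetical, 
and it assumes a POSIX system that provides posix_memalign. The idea is 
to test a pointer against the boundary and, when it is misaligned or 
strided, copy the data into a contiguous, aligned temporary buffer.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SGEMV_ALIGN 32 /* byte boundary mentioned in the message */

/* Return 1 if p lies on a 32-byte boundary, 0 otherwise. */
static int is_aligned32(const void *p)
{
    return ((uintptr_t)p % SGEMV_ALIGN) == 0;
}

/* Hypothetical helper: copy n floats, read from src with a stride of
 * inc elements (inc >= 1), into a newly allocated, contiguous buffer
 * aligned to 32 bytes. Returns NULL if the allocation fails.
 * The caller releases the buffer with free(). */
static float *copy_aligned32(const float *src, size_t n, size_t inc)
{
    void *buf = NULL;
    float *dst;
    size_t i;

    assert(inc >= 1);
    /* Request at least one element so the size is never zero. */
    if (posix_memalign(&buf, SGEMV_ALIGN, (n ? n : 1) * sizeof(float)) != 0)
        return NULL;

    dst = (float *)buf;
    for (i = 0; i < n; i++)
        dst[i] = src[i * inc];
    return dst;
}
```

A wrapper around cblas_sgemv could call is_aligned32 on the matrix and 
vector pointers, pass the originals through unchanged when they are 
already aligned, and otherwise hand the library the buffer returned by 
copy_aligned32, freeing it after the call.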

I have tested with Accelerate, OpenBLAS and MKL, clang and icc. From 
what I can tell it works correctly and does not segfault on misalignment.


Sturla
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sgemv_patch.zip
Type: application/zip
Size: 54470 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20140610/0805f8aa/attachment.zip>

