[Numpy-discussion] Strategy for OpenBLAS

Wed May 27 04:41:15 EDT 2015

2015-05-27 10:26 GMT+02:00 Carl Kleffner <cmkleffner at gmail.com>:

>
>
> 2015-05-27 10:13 GMT+02:00 Nathaniel Smith <njs at pobox.com>:
>
>> On Tue, May 26, 2015 at 9:53 AM, Julian Taylor
>> <jtaylor.debian at googlemail.com> wrote:
>> > On 05/26/2015 04:56 PM, Matthew Brett wrote:
>> >> Hi,
>> >>
>> >> This morning I was wondering whether we ought to plan to devote some
>> >> resources to collaborating with the OpenBLAS team.
>> >>
>> >>
>> >>
>> >> It is relatively easy to add tests using Python / numpy.  We like
>> >> tests.  Why don't we propose a collaboration with OpenBLAS where we
>> >> build and test numpy with every / most / some commits of OpenBLAS, and
>> >> try to make it easy for the OpenBLAS team to add tests. Maybe we
>> >> can use and add to the list of machines on which OpenBLAS is tested
>> >> [1]?  We Berkeley Pythonistas can certainly add the machines at our
>> >> buildbot farm [2].  Maybe the Julia / R developers would be interested
>> >> to help too?
>> >>
>> >
>>
> Some benchmark results made by @wernsaar can be found at
http://sourceforge.net/p/slurm-roll/code/HEAD/tree/branches/benchmark/ .
I guess these were made on Linux, so they cannot be directly applied to
Windows. See e.g. https://github.com/xianyi/OpenBLAS/issues/532. In general,
the OpenBLAS development trunk runs smoothly on Windows now.


>> > Technically we only need a single machine with the newest instruction
>> > set available. All other cases could then be tested via a virtual
>> > machine that only exposes specific instruction sets (e.g. qemu which
>> > could technically also emulate stuff the host does not have).
>> >
>> > Concerning test generation, there is a huge parameter space that needs
>> > testing with openblas; at least some of it would need to be
>> > automated/fuzzed. We also need specific preconditioning of memory to
>> > test failure cases openblas had in the past, e.g. filling memory around
>> > the matrices with nans and also somehow filling openblas' own temporary
>> > buffers with some signaling values (might require a specially built
>> > openblas if MALLOC_PERTURB_ does not work).
>>
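The "fill memory around the matrices with nans" idea can be tried from the
numpy side without touching OpenBLAS. The following is only a sketch under
assumptions: the helper names (guarded_matrix, guard_intact, check_gemm) are
made up for illustration, numpy is assumed to be linked against the BLAS
under test, and numpy may copy strided inputs before calling BLAS, in which
case the guard region is never seen by the kernel. It does not cover
OpenBLAS' internal temporary buffers.

```python
import numpy as np


def guarded_matrix(rows, cols, pad, rng):
    """Return (buffer, view): a NaN-filled buffer and an interior view with random data."""
    buf = np.full((rows + 2 * pad, cols + 2 * pad), np.nan)
    view = buf[pad:pad + rows, pad:pad + cols]
    view[...] = rng.standard_normal((rows, cols))
    return buf, view


def guard_intact(buf, rows, cols, pad):
    """True if every element outside the interior view is still NaN."""
    mask = np.ones(buf.shape, dtype=bool)
    mask[pad:pad + rows, pad:pad + cols] = False
    return bool(np.isnan(buf[mask]).all())


def check_gemm(m, k, n, pad=8, seed=0):
    """Multiply two NaN-surrounded matrices and check result and guard regions."""
    rng = np.random.default_rng(seed)
    a_buf, a = guarded_matrix(m, k, pad, rng)
    b_buf, b = guarded_matrix(k, n, pad, rng)

    c = a @ b  # goes through numpy's matmul, which may dispatch to BLAS

    # Reference product via elementwise broadcasting, which does not call gemm.
    ref = (a[:, :, None] * b[None, :, :]).sum(axis=1)

    return bool(
        not np.isnan(c).any()                      # a read overrun would pull in NaN
        and np.allclose(c, ref, rtol=1e-10, atol=1e-10)
        and guard_intact(a_buf, m, k, pad)         # a write overrun would clobber NaN
        and guard_intact(b_buf, k, n, pad)
    )
```

Calling check_gemm(5, 7, 3) for a range of shapes gives a cheap first pass;
a False return only says that something is off, not which kernel caused it.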
>> A lot of this stuff is easier if we take a white-box instead of
>> black-box approach -- adding hooks in OpenBLAS to override the
>> CPU-based kernel-autoselection sounds a lot easier than creating
>> unnatural machines in qemu, and similarly for initializing temporary
>> buffers. (I would be really unsurprised if OpenBLAS re-uses temporary
>> buffers across calls instead of doing a free/re-malloc, for example.)
>>
>> Manually overriding the OpenBLAS CPU autoselection can easily be done by
> setting the OPENBLAS_CORETYPE environment variable, e.g.
> export OPENBLAS_CORETYPE=Nehalem
>
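Since the variable has to be set before the library is loaded, one way to
compare kernels from a test script is to start a fresh interpreter per core
type. This is a rough sketch, not a tested harness: run_with_coretype,
compare_coretypes and SNIPPET are hypothetical names, the override is assumed
to take effect only in an OpenBLAS built with runtime kernel selection
(DYNAMIC_ARCH), and core names other than Nehalem are assumptions to be
checked against the OpenBLAS build in use. Forcing a core type whose
instructions the host CPU lacks will most likely crash the child process.

```python
import os
import subprocess
import sys


def run_with_coretype(coretype, code):
    """Run `code` in a fresh interpreter with OPENBLAS_CORETYPE set; return its stdout."""
    env = dict(os.environ, OPENBLAS_CORETYPE=coretype)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,  # raise if the child exits with a non-zero status
    )
    return result.stdout.strip()


# Hypothetical snippet: prints a checksum of a small matrix product.
SNIPPET = (
    "import numpy as np\n"
    "a = np.arange(12.0).reshape(3, 4)\n"
    "print(repr(float((a @ a.T).sum())))\n"
)


def compare_coretypes(coretypes, code=SNIPPET):
    """Map each core type name to the output of `code` run under that override."""
    return {core: run_with_coretype(core, code) for core in coretypes}
```

For example, compare_coretypes(["Nehalem", "Sandybridge"]) returns one output
string per core type, and any difference between them is worth a closer look.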
>
>> > Maybe it would be feasible to write a hypothesis [0] strategy for some
>> > of the blas stuff to automate the parameter exploration.
>>
>> Or if this is daunting, you can get pretty far just sitting down and
>> writing some for loops... I think this is a case where something is a
>> lot better than nothing :-).
>>
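A "just some for loops" version of the parameter sweep might look like the
sketch below. The names reference_matmul and sweep_gemm, the chosen sizes and
the tolerances are all assumptions made for illustration; it only covers
matrix-matrix products reached through numpy's @ operator, and numpy is
assumed to be linked against the BLAS under test.

```python
import itertools

import numpy as np


def reference_matmul(a, b):
    """Naive product via elementwise broadcasting, avoiding the BLAS gemm path."""
    return (a[:, :, None] * b[None, :, :]).sum(axis=1)


def sweep_gemm(sizes=(1, 2, 3, 7, 16, 33), dtypes=(np.float32, np.float64), seed=0):
    """Loop over shapes, dtypes and transposes; return the list of failing cases."""
    rng = np.random.default_rng(seed)
    failures = []
    for m, k, n in itertools.product(sizes, repeat=3):
        for dtype, trans_a, trans_b in itertools.product(
            dtypes, (False, True), (False, True)
        ):
            # Store the operand transposed when requested, so that the .T view
            # below has the memory layout of a transposed BLAS argument.
            a = rng.standard_normal((k, m) if trans_a else (m, k)).astype(dtype)
            b = rng.standard_normal((n, k) if trans_b else (k, n)).astype(dtype)
            a_op = a.T if trans_a else a
            b_op = b.T if trans_b else b

            got = a_op @ b_op
            want = reference_matmul(a_op.astype(np.float64), b_op.astype(np.float64))

            tol = 1e-4 if dtype == np.float32 else 1e-10
            if not np.allclose(got, want, rtol=tol, atol=tol):
                failures.append((m, k, n, np.dtype(dtype).name, trans_a, trans_b))
    return failures
```

An empty list from sweep_gemm() means no mismatch was found in this small
grid; extending the loops to strides, offsets and complex dtypes is the
obvious next step.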
>> -n
>>
>> --
>> Nathaniel J. Smith -- http://vorpus.org
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
>