[Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration"

Neal Becker ndbecker2 at gmail.com
Thu Mar 12 08:14:17 EDT 2015

Ralf Gommers wrote:

> On Wed, Mar 11, 2015 at 11:20 PM, Dp Docs <sdpan21 at gmail.com> wrote:
>> On Thu, Mar 12, 2015 at 2:01 AM, Daπid <davidmenhur at gmail.com> wrote:
>> >
>> > On 11 March 2015 at 16:51, Dp Docs <sdpan21 at gmail.com> wrote:
>> >> On Wed, Mar 11, 2015 at 7:52 PM, Sturla Molden
>> >> <sturla.molden at gmail.com> wrote:
>> >> >
>> >> > There are at least two ways to proceed here. One is to only use
>> >> > vector math when strides and alignment allow it.
>> >> I didn't get it. Can you explain in detail?
>> >
>> >
>> > One example: you can create a numpy 2D array using only the odd rows
>> > and columns of a matrix.
>> >
>> > odd_matrix = full_matrix[::2, ::2]
>> >
>> > This is just a view of the original data, so you save the time and the
>> > memory of making a copy. The drawback is that you trash memory
>> > locality, as the elements are not contiguous in memory. If the memory
>> > is guaranteed to be contiguous, a compiler can apply extra
>> > optimisations, and this is what vector libraries usually assume. What I
>> > think Sturla is suggesting with "when strides and alignment allow it"
>> > is to use the fast version if the array is contiguous, and fall back to
>> > the present implementation otherwise. Another option would be to make
>> > an optimally aligned copy, but that could eat up whatever time we save
>> > from using the faster library, and cause other problems.
>> >
>> > The difficulty with Numpy's strides is that they allow so many ways of
>> > manipulating the data... (alternating elements, transpositions,
>> > different precisions...).
>> >
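A minimal Python sketch of the "fast path only when strides and alignment allow it" idea quoted above, in case it helps. The names fast_exp, generic_exp and dispatch_exp are hypothetical placeholders (both kernels simply call np.exp here); they are not existing Numpy internals or any vector library's API. Only the flags check is the point.

```python
import numpy as np


def fast_exp(x):
    # Hypothetical stand-in for a vector-library kernel that assumes
    # contiguous, aligned input. Here it just calls np.exp.
    return np.exp(x)


def generic_exp(x):
    # Stand-in for the present implementation, which copes with any strides.
    return np.exp(x)


def dispatch_exp(x):
    # Take the fast path only when the memory layout allows it.
    # Note: flags.aligned only means "aligned for the dtype"; a real SIMD
    # kernel might need a stricter check on the data pointer.
    if x.flags.c_contiguous and x.flags.aligned:
        return "fast", fast_exp(x)
    return "generic", generic_exp(x)


full_matrix = np.arange(36, dtype=np.float64).reshape(6, 6)
odd_matrix = full_matrix[::2, ::2]  # a view, not a copy

print(full_matrix.strides, odd_matrix.strides)
print(dispatch_exp(full_matrix)[0])
print(dispatch_exp(odd_matrix)[0])
# The "optimally aligned copy" alternative: pay for a copy, then go fast.
print(dispatch_exp(np.ascontiguousarray(odd_matrix))[0])
```

The view shares memory with full_matrix but is not C-contiguous, so it takes the fallback branch; the contiguous copy takes the fast branch at the cost of the copy itself.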
>> >>
>> >> I think the actual problem is not which library to integrate but how
>> >> to integrate these libraries. From what I have seen of the code base
>> >> and been told, the current implementation uses the C math library.
>> >> Can we keep the current implementation and, wherever it calls C math
>> >> functions, replace those calls with the fast library's functions?
>> >> Then we would have to modify the Numpy library (which usually gets
>> >> imported for math operations) with some if/else conditions: first
>> >> try the faster library and, if it is not available, look for the
>> >> default one.
>> >
>> >
>> > At the moment, we are linking to whichever LAPACK is available at
>> > compile time, so no need for a runtime check. I guess it could
>> > (should?) be the same.
>> I didn't understand this. What I was asking is: say I have chosen one
>> faster library; now I need to integrate it in *some way* without
>> changing the default functionality, so that after "from numpy import *"
>> Numpy can access the integrated library's functions as well as the
>> default library's functions. What should that *some way* be? Even at
>> compile time, it needs to decide which function it is going to use,
>> right?
> Indeed, it should probably work similar to how BLAS/LAPACK functions are
> treated now. So you can support multiple libraries in numpy (pick only one
> to start with of course), but at compile time you'd pick the one to use.
> Then that library gets always called under the hood, i.e. no new public
> functions/objects in numpy but only improved performance of existing ones.
>> Integration of MKL has been discussed above, but when MKL is not
>> available on the hardware architecture, will the above library act as
>> the default library? If yes, then the integration method discussed
>> above may be the one required for this project, right? Can you please
>> tell me a bit more or provide some link related to that?
>> Does the availability of these faster libraries depend on the hardware
>> architecture, or on the availability of hardware resources in a system?
>> Because if it is the latter, this newly integrated library would
>> support operations sometimes and sometimes not?
> Not HW resources I'd think. Looking at http://www.yeppp.info, it supports
> all commonly used cpus/instruction sets.
> As long as the accuracy of the library is OK this should not be noticeable
> to users except for the difference in performance.
>> I believe it's the first one, but it is better to clear up any
>> confusion. For example, suppose availability means the latter: say
>> library A needs hardware A1 and A1 is busy; then A cannot support the
>> operation. Meanwhile library B needs hardware B1, which is not busy, so
>> B can support it. What I want to say is that if availability of a
>> faster library meant availability of hardware resources at the
>> particular time we want to do an operation, it would be totally
>> unpredictable; availability would be random. Even worse, if some time
>> passes between compiling and running and the hardware resource has been
>> allocated to another process in the meantime, using these operations
>> would be very problematic. So this leads me to think that availability
>> of a library means whether the type of hardware architecture supports
>> that library. Since there are many kinds of hardware architecture and
>> no one library necessarily supports all of them (though one may), we
>> would need to integrate more than one library to support every kind of
>> architecture (in the ideal case, which would make this a very big
>> project).
>> >
>> >>
>> >> Moreover, I have another doubt. Are we supposed to integrate just
>> >> one fast library, or more than one, so that if one is not available
>> >> we look for the second, and if the second is not available we either
>> >> go to the default or look for a third one if available?
>> >> Are we supposed to think like this: say "exp" is faster in the sleef
>> >> library, so integrate sleef for this operation, and say "sin" is
>> >> faster in some other library, so integrate that library for sin? I
>> >> mean, different operations may be faster in different libraries, so
>> >> should the implementation be operation oriented, or should we just
>> >> integrate one complete library? Thanks
>> >
>> >
>> > Which one is faster depends on the hardware, the version of the
>> > library, and even the size of the problem:
>> > http://s3.postimg.org/wz0eis1o3/single.png
>> >
>> > I don't think you can reliably decide ahead of time which one should go
>> > for each operation. But, on the other hand, whichever one you go for
>> > will probably be fast enough for anyone using Python. Most of the work
>> > here is adapting Numpy's machinery to dispatch a call to the vector
>> > library; once that is ready, adding another one will hopefully be
>> > easier. At least, at the moment Numpy can use one of several linear
>> > algebra packages (MKL, ATLAS, CBLAS...) and they are added, I think,
>> > without too much pain (but maybe I am just far away from the screams of
>> > whoever did it).
>> >
>> So we are supposed to integrate just one of these libraries? (The rest
>> will use the default if it doesn't support them.) MKL seems to be good,
>> but as discussed above it's non-free and it has been integrated
>> already. Can you suggest any other library that at least comes close to
>> MKL? Eigen seems to be good, but it seems to be worse in the middle
>> ranges. Can you provide any link with comparative information about all
>> the available (free) vector libraries?
> The idea on the GSoC page suggests http://www.yeppp.info/ or SLEEF (
> http://shibatch.sourceforge.net/). Based on those websites I'm 99.9% sure
> that yeppp is a better bet. At least its benchmarks say that it's faster
> than MKL. As for the project, Julian (who'd likely be the main mentor) has
> already indicated when suggesting the idea that he has no interest in a
> non-free library:
> http://comments.gmane.org/gmane.comp.python.numeric.general/56933. So
> Yeppp + the build architecture to support multiple libraries later on
> would probably be a good target.
> Cheers,
> Ralf

Thanks for the tip about yeppp.  While it looks interesting, it seems to be 
pretty limited.  Just a few transcendental functions.  I didn't notice 
complex either (e.g., dot product). 


Those who fail to understand recursion are doomed to repeat it
