[Numpy-discussion] low level optimization in NumPy and minivect
Charles R Harris
charlesr.harris at gmail.com
Wed Jun 19 08:48:32 EDT 2013
On Wed, Jun 19, 2013 at 5:45 AM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> On Wed, Jun 19, 2013 at 1:43 AM, Frédéric Bastien <nouiz at nouiz.org> wrote:
> > Hi,
> >
> >
> > On Mon, Jun 17, 2013 at 5:03 PM, Julian Taylor
> > <jtaylor.debian at googlemail.com> wrote:
> >>
> >> On 17.06.2013 17:11, Frédéric Bastien wrote:
> >> > Hi,
> >> >
> >> > I saw that Julian Taylor has recently been doing many low-level
> >> > optimizations, such as using SSE instructions. I think it is great.
> >> >
> >> > Last year, Mark Florisson released the minivect[1] project that he
> >> > worked on during his master's thesis. minivect is a compiler for
> >> > element-wise expressions that does some of the same low-level
> >> > optimizations that Julian is doing in NumPy right now.
> >> >
> >> > Mark wrote minivect in a way that allows it to be reused by other
> >> > projects. I think it is now used by Cython and Numba. I had planned
> >> > to reuse it in Theano, but I haven't had the time to integrate it so
> >> > far.
> >> >
> >> > What about reusing it in NumPy? I think that some of Julian's
> >> > optimizations aren't in minivect (I didn't check to confirm). But from
> >> > what I heard, minivect doesn't implement reductions, and there is a
> >> > pull request to optimize these in NumPy.
> >>
> >> Hi,
> >> what I vectorized is just the really easy cases of unit-stride
> >> contiguous operations, so the min/max reduction that is now in numpy
> >> is in essence pretty trivial.
> >> minivect goes much further in optimizing general strided access and
> >> broadcasting via loop optimizations (it seems to have a lot of overlap
> >> with the graphite loop optimizer available in GCC [0]) so my code is
> >> probably not of very much use to minivect.
> >>
> >> The most interesting part in minivect for numpy is probably the
> >> optimization of broadcasting loops which seem to be pretty inefficient
> >> in numpy [0].
> >>
> >> Concerning the rest, I'm not sure how much of a bottleneck general
> >> strided operations really are in common numpy-using code.
> >>
> >>
> >> I guess a similar discussion about adding an expression compiler to
> >> numpy has already happened when numexpr was released?
> >> If yes what was the outcome of that?
> >
> >
> > I don't recall a discussion when numexpr was released, as that was
> > before I read this list. numexpr does an optimization that can't be
> > done by NumPy: fusing element-wise operations into one call. So I
> > don't see how it could be reused in NumPy.
> >
> > You call your optimizations trivial, but I don't. In the git log of
> > NumPy, the first commit is from 2001. This is the first time someone
> > has done this in 12 years! Also, this gives a 1.5-8x speedup (from
> > memory, from your PR description). This is not negligible. But how much
> > time did you spend on them? Also, some of them are processor dependent;
> > how many people on this list have already done this? I suppose not many.
> >
> > Yes, your optimizations don't cover all the cases that minivect does. I
> > see 2 levels of optimization: 1) the inner loop/contiguous cases, 2) the
> > strided, broadcasted level. We don't need all optimizations to be done
> > for them to be useful. Any of them is useful.
> >
> > So what I think is that we could reuse/share that work. NumPy has C code
> > generators. They could call the minivect code generator for some of them
> > when compiling NumPy. This would let optimizations made to those code
> > generators be reused by more people. For example, when new processors
> > are launched, we will need to change only 1 place for many projects. Or,
> > for example, if the call to the MKL vector library is done there, more
> > people will benefit from it. Right now, only numexpr does it.
> >
> > About the level 2 optimizations (strides, broadcast), I have never read
> > the NumPy code that deals with that. Does someone who knows it have an
> > idea whether it would be possible to reuse minivect for this?
>
> Would someone be able to guide some of the numpy C experts into a room
> to do some thinking / writing on this at the scipy conference?
>
> I completely agree that these kinds of optimizations and code sharing
> seem likely to be very important for the future.
>
> I'm not at the conference, but if there's anything I can do to help,
> please someone let me know.
>
Concerning the future development of numpy, I'd also suggest that we look
at libdynd <https://github.com/ContinuumIO/libdynd>. It looks to me like it
is reaching a level of maturity where it is worth trying to plan out a long
term path to merger.
Chuck
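
For readers following the quoted point about fusing element-wise operations
into one call, here is a minimal sketch of the idea in pure Python. It is
only an illustration under stated assumptions: the function names `unfused`
and `fused` are hypothetical and are not part of NumPy, numexpr, or minivect,
and plain lists stand in for arrays.

```python
# Hypothetical illustration only: these helpers are not NumPy, numexpr,
# or minivect APIs. Plain Python lists stand in for arrays.

def unfused(a, b, c):
    """Evaluate a*b + c one operation at a time, as two separate passes."""
    tmp = [x * y for x, y in zip(a, b)]     # pass 1: builds a full temporary
    return [t + z for t, z in zip(tmp, c)]  # pass 2: reads the temporary back


def fused(a, b, c):
    """Evaluate a*b + c in a single pass, with no intermediate list."""
    return [x * y + z for x, y, z in zip(a, b, c)]
```

Both functions return the same values. The fused version traverses the
inputs once and allocates no temporary, which is roughly the saving an
expression compiler aims for when it sees the whole expression at once.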